Sync master frok branch & Handle GAM order approval permission denied gracefully#17
Merged
Conversation
* fix(#365): allow AI/Logfire test-connection probes on embedded tenants (#369)
The "Test Connection" action for AI providers and Logfire on the
Integrations tab failed with "Failed: embedded_writes_not_permitted"
on embedded tenants, because the verb-based embedded-write gate
classifies any POST under `require_tenant_access` as a mutation.
`test_ai_connection` and `test_logfire_connection` are read-only
probes — they validate credentials against the upstream provider and
never write tenant state. Opt them into `allow_embedded_writes=True`;
the model-layer guard in `embedded_tenant_guard.py` remains in force
as defense-in-depth.
Also fix the test-result handlers in `templates/tenant_settings.html`
to render `data.message || data.error` instead of `data.error` alone.
Gate envelopes (and any future role-gate rejections) return both a
stable code in `error` and a human-readable string in `message`; the
old code surfaced the stable code, which read as gibberish to users.
Sweep finding (left as follow-up): the same verb-based-gate trap
exists on `tenants.test_slack`, `adapters.test_freewheel_connection`,
`adapters.test_triton_connection`, `adapters.test_broadstreet_connection`,
and `settings.test_domain_access`. Each is a read-only probe that
could opt in with the same flag.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#363): unblock Policies & Workflows writes on embedded tenants (#370)
On embedded tenants every field in the Policies & Workflows tab
(Brand Manifest Policy, Naming Conventions, Approval Workflows,
Measurement Providers, Product Ranking, Auto-approval thresholds)
silently reverted on save. Two compounding bugs:
1. Route blocked at the boundary. `/settings/business-rules` POST
used `@require_tenant_access(role=("admin",))` without
`allow_embedded_writes=True`, so the verb-based gate returned
403 `embedded_writes_not_permitted` before the handler ran.
2. JS treated the 403 HTML error page as success. `saveBusinessRules`
in `tenant_settings.js` content-type-branched: any HTML response
with no `.flash-messages` container fell through to
`window.location.reload()`. Flask's default 403 error page has no
flash messages → reload-as-success → user sees their fields revert
with no error. Affected every 4xx/5xx on that route.
Fix three layers:
- Add `allow_embedded_writes=True` to `update_business_rules`. Per
Sprint 5 design (`docs/design/embedded-mode-sprint-5.md` §"Pattern:
shared business logic with the UI"), business rules are
publisher-managed and edited via the proxied admin UI; the
management API exposes the same writes.
- Add the per-column business-rules surface to
`PUBLISHER_WRITABLE_FIELDS[Tenant]` (13 fields covering naming
templates, approval mode, creative review settings, AI policy,
advertising policy, brand manifest policy, product ranking prompt,
human review flag). Platform-identity columns (name, billing_plan,
is_active, subdomain, external_*) stay locked.
- Add `gam_manual_approval_required` / `mock_manual_approval_required`
to `PUBLISHER_WRITABLE_FIELDS[AdapterConfig]` — these mirror
`tenant.human_review_required` onto adapter config and are written
by the same handler.
- Restructure `saveBusinessRules` to check `response.ok` BEFORE
content-type branching. Non-2xx responses now surface the error
(parsing flash messages from HTML when available, falling back to
the status code) instead of silently reloading.
Added four guard tests in `test_managed_tenant_api.py::TestWriteGuard`:
business-rules columns write, manual-approval adapter columns write,
platform-identity columns stay blocked, and an end-to-end check via the
mock adapter sync field.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#364): explain empty Allowed Principals dropdown on embedded tenants (#371)
* fix(#364): explain empty Allowed Principals dropdown on embedded tenants
On embedded tenants the "Allowed Principals (Advertisers)" multi-select
on Create Product rendered only "No principals configured" — a dead
end. The Buyer Agents section in Settings hides the "Add Buyer Agent"
button on embedded tenants (this is correct: Principal provisioning is
platform-managed via the Tenant Management API), so publishers had no
path to populate the dropdown.
Two compounding things made the UI misleading:
1. The empty-state placeholder didn't distinguish embedded from open
instances. Publishers saw the same "No principals configured" text
that suggests they can fix it themselves.
2. Comments in `tenant_settings.html` and `buyer_advertiser_routing.py`
claimed Principals are "auto-created on first request by the
embedded-mode auth bypass, which reads X-Identity-Buyer-Principal-Id".
That mechanism does not exist — grep `src/` for the header returns
zero matches. Anyone tracing the empty dropdown ran into a dead-end
comment that confidently pointed at a code path that isn't there.
Fix:
- In `add_product.html` and `add_product_gam.html`, replace the
disabled `<option>No principals configured</option>` with a
context-aware empty state. Embedded tenants get an explainer that
Principals are provisioned by the platform via the Tenant Management
API; open instances get a pointer to Settings → Buyer Agents.
- Rewrite the misleading comment block in `tenant_settings.html`
around the advertisers section and the user-visible "auto-created
from request headers" line — state plainly that embedded Principal
provisioning goes through the platform API.
- Fix the matching dead-pointer comment in
`buyer_advertiser_routing.py` near the access-grant logic.
Option B (platform-managed) per `docs/design/embedded-mode-sprint-5.md`
contract. Option A (re-enable UI authoring) would have been a
write-guard expansion that contradicts the existing
`{% if not embedded_view %}` gate on "Add Buyer Agent" — and the model
guard doesn't list Principal at all, so it's the UI gate alone holding
the line. Not the right place to flip the contract.
Terminology cleanup ("Allowed Principals" vs "Buyer Agents" vs
"Advertisers") is deliberately left for a follow-up issue — that's a
larger UX project than a bug fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(#364): update assertions to match corrected embedded-mode copy
The original test asserted on the misleading "auto-created from request
headers" copy that #364 removed (because the auto-create mechanism does
not exist — see #364 PR description). Update the assertions to match
the new, accurate copy that explains platform-API provisioning.
Also refresh the class docstring to drop the same misleading claim about
header-based auto-creation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#374): coerce update_media_buy status to wire enum at response boundary (#375)
The manual-approval path on ``update_media_buy`` read ``MediaBuy.status``
straight from the DB column and surfaced it on the
``UpdateMediaBuySuccess`` response. The persisted column accepts a
broader set than the AdCP wire enum — ``draft`` (model default) and
``pending_approval`` (manual-approval create path) are both valid in
storage but not in ``MediaBuyStatus``. fastmcp's request-/response-side
Pydantic validation rejected the response with
``INVALID_REQUEST[status]: Input should be 'pending_creatives',
'pending_start', 'active', 'paused', 'completed', 'rejected' or
'canceled'``, which surfaced as an E2E failure on
``test_complete_campaign_lifecycle_with_webhooks`` (#374) and on every
PR's CI run after the manual-approval status-emission was added in #353.
Fix:
- Add ``_to_wire_status`` in ``media_buy_list.py``. Takes any input
(``str | MediaBuyStatus | None``) and returns either a wire-valid
string from the seven-member enum, or ``None`` for values the wire
rejects. Case-insensitive on string input.
- Apply it at the manual-approval response site in
``update_media_buy.py``. ``current_status`` is now guaranteed
wire-valid (or ``None``) before reaching ``UpdateMediaBuySuccess``.
The other three response-status sites (cancel, pause/resume, final
``_compute_status`` path) already emit values from the wire enum
by construction.
Tests:
- ``TestToWireStatus`` (6 cases): wire-valid passthrough, case
insensitivity, persisted-only rejection (``draft``,
``pending_approval``), ``None``/empty/non-string handling.
- ``test_manual_approval_response_coerces_non_wire_db_status_to_none``:
end-to-end behavior — a persisted ``pending_approval`` does not leak
to the response.
- ``test_manual_approval_response_preserves_wire_valid_db_status``:
wire-valid statuses still pass through unchanged.
Verified locally:
- Failing E2E ``test_complete_campaign_lifecycle_with_webhooks`` passes
against the full Docker stack.
- ``tox -e unit`` (4314 tests) and ``tox -e integration`` (1030
update_media_buy-adjacent tests) both green.
Fixes #374.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(transport): env-gated stateless MCP mode for multi-replica deploys (#376)
The MCP Python SDK's ``StreamableHTTPSessionManager`` stores
``_server_instances`` as a process-local dict. Multi-replica deployments
without sticky LB routing on ``Mcp-Session-Id`` see ``tools/list`` and
``tools/call`` randomly 404 with "Session not found" when a request
lands on a replica that didn't handle ``initialize``.
A 10-attempt probe against the Wonderstruck deployment confirmed the
dice roll: ``initialize`` always 200 (creates session on whichever
replica answers); ``tools/list`` and ``tools/call`` with the same
session ID succeeded only when they happened to land on the same
replica (~50/50 each). Yesterday's compliance baseline (170 steps, 12
tools discovered) caught the deployment during a single-replica window;
today the same baseline rerun returned 0 tools because
``discoverAgentProfile`` calls ``initialize`` → ``tools/list`` in tight
succession, and ``tools/list`` lost the affinity coin flip half the
time.
``serve()`` has supported ``stateless_http: bool`` since adcp 5.0
(``adcp/server/serve.py:2053`` sets ``mcp.settings.stateless_http``
from the kwarg unconditionally, so ``FASTMCP_STATELESS_HTTP`` env
alone has no effect — the kwarg overrides FastMCP's reader). This
plumbs the kwarg through ``_serve_kwargs`` gated on
``ADCP_STATELESS_HTTP``:
* Unset / falsy → stateful (default). Single-replica prod, local dev,
in-process tests, and the compliance-runner storyboard sweep keep
the session-reuse perf optimization.
* ``ADCP_STATELESS_HTTP=true`` → stateless. Each request creates a
fresh transport context; multi-replica works without sticky LB.
Per the FastMCP deployment doc
(https://gofastmcp.com/v2/deployment/http): stateless mode is the
recommended pattern for horizontal scaling — cookie-based stickiness
is unreliable because most MCP clients use ``fetch()`` and drop
``Set-Cookie``. Header-based stickiness on ``Mcp-Session-Id`` would
also work (the AdCP SDK forwards the header cleanly) and would keep
session-reuse perf on prod compliance runs; this env var doesn't
preclude that — the deployment chooses by setting / unsetting
``ADCP_STATELESS_HTTP``.
Tests verify the env var maps to the kwarg correctly across true /
false / unset and case variants. Existing
``test_serve_kwargs_middleware_order.py`` extended with the new
``stateless_http``-focused cases.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(deps): bump adcp 5.2.0 → 5.3.0 (#379)
Picks up:
- SellerA2AClient for in-process A2A handler testing (#694)
- PgBuyerAgentRegistry.with_caching() factory (#692)
- v3 storyboard CI gate that actually asserts (#693)
- Sequence[T] widening on response-only list fields (#635)
- Composed lifespan preservation when public_url is callable (#680)
- ads.txt MANAGERDOMAIN fallback discovery (#704/#705)
- validate_adagents_structure helper (#708)
- webhook_signing.supported boot validator (#695)
Audited the codebase for workarounds the bump should now obsolete. One
real candidate: AgentCardPublicUrlMiddleware (190 LOC) — #680 means
transport="both" + callable public_url now works. Replacing it with a
public_url=resolver callable will land separately.
Two workarounds the bump can't eliminate, filed upstream:
- serve(lifespan=) hook missing — adcp-client-python#709
- cross-class entity overrides still need type:ignore[assignment] —
adcp-client-python#710
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#377): four-state aao_status_kind + permissive unbound resolution (#380)
Wonderstruck-class publishers ship bare ``authorized_agents`` entries
(``{url, authorized_for}`` only, no ``authorization_type``) alongside a
top-level ``properties[]`` block. The AdCP SDK's strict resolver returns
``[]`` for these, so:
- Publisher Partnerships chip rendered "Pending 0/0" — misleading
operators into thinking the publisher hadn't authorized us yet.
- Products UI used to bind anyway via a homegrown heuristic, then a
prior pass tightened it to match the SDK — regressing Wonderstruck.
This change introduces a four-state ``PublisherPartnerStatusKind``
(``authorized`` | ``unbound`` | ``pending`` | ``no_properties`` |
``unreachable``) and an explicit permissive resolution path:
- ``aao_lookup_service.get_publisher_partner_status`` uses the SDK
strictly first; falls back to ``unbound`` only when our entry is
bare and the file has top-level properties. Surfaces a conformance
hint so operators can nudge the publisher to add a typed binding.
- ``property_discovery_service._extract_properties`` mirrors the same
classification and, on the unbound branch, gates top-level
properties to those carrying a ``type=domain`` identifier matching
the publisher_domain — closes the attack vector where a publisher
could bare-list us + claim arbitrary app/podcast/DOOH bundle IDs.
- Shared shape helpers in ``src/services/_adagents_shapes.py``
(``is_bare_entry``, ``find_agent_entry``, ``top_level_properties``)
cover the full schema selector set including ``signal_ids`` /
``signal_tags``.
- New nullable ``aao_status_kind`` column on ``publisher_partners`` —
legacy NULL rows fall back to the existing derivation in
``_partner_to_dict`` so the rollout is safe under rolling deploys.
- JS chip styles for ``unbound`` ("Authorized (non-conformant file)")
and ``no_properties`` ("No properties listed").
Upstream issues filed in parallel for ecosystem alignment:
- adcontextprotocol/adcp#4478 — typed
``authorization_type: "all_top_level_properties"`` variant so
publishers have a spec-conformant shape; once shipped we can
deprecate the local permissive shim.
- adcontextprotocol/adcp-client-python#711 — permissive resolver API.
- adcontextprotocol/adcp-client#1721 — TS SDK per-agent resolution +
permissive mode for cross-SDK consistency.
Fixes #377
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(compliance): residual fixes from 7.1.0 probe — INVALID_REQUEST, INVALID_STATE, WWW-Authenticate (#383)
* fix(compliance): residual fixes from 7.1.0 probe — INVALID_REQUEST, INVALID_STATE, WWW-Authenticate
Closes three residual storyboard failures observed in the 7.1.0 comply()
re-probe against Wonderstruck after #348/#349 fixes deployed:
1. **error_compliance/nonexistent_product** — pre-dispatch validation in
``_create_media_buy_impl`` raised ``ValueError`` (past start_time, reversed
dates, etc.) and the outer ``except (ValueError, PermissionError)`` handler
emitted ``Error(code="VALIDATION_ERROR")``. ``VALIDATION_ERROR`` is not in
the AdCP 3.0 ``STANDARD_ERROR_CODES`` enum, so buyer agents walking the
enum for self-correction silently drop the error. Change wire code to
spec-canonical ``INVALID_REQUEST``. Storyboard expects
``PRODUCT_NOT_FOUND``, ``PRODUCT_UNAVAILABLE``, or ``INVALID_REQUEST``;
sibling ``reversed_dates_error`` accepts ``VALIDATION_ERROR`` or
``INVALID_REQUEST``. ``INVALID_REQUEST`` is the only value in both
sets and is the spec-canonical choice.
2. **media_buy_state_machine/pause_canceled_buy** — ``_update_media_buy_impl``
had a terminal-state guard on cancel (re-cancel raises
``AdCPNotCancellableError``) but the pause/resume branch dispatched
straight to the adapter. Spec requires rejection with
``/adcp_error/code == "INVALID_STATE"`` for pause-of-canceled.
New exception ``AdCPInvalidStateError`` (``error_code="INVALID_STATE"``,
recovery ``correctable``, 422) covers the symmetric guard. Fires
BEFORE adapter dispatch on both terminal states (``canceled``,
``completed``) for both actions (``paused=True``, ``paused=False``).
Idempotency-spec friendly: same payload yields the same wire code on
retry regardless of which adapter would have handled the transition.
3. **security_baseline/probe_unauth** — RFC 6750 §3 requires a
``WWW-Authenticate: Bearer`` header on every 401 from a Bearer-protected
resource. Upstream ``adcp.server.auth.BearerTokenAuthMiddleware`` on the
MCP leg returns 401 without the header for missing/invalid tokens; the
A2A leg and ``SigningVerifyMiddleware`` already emit it correctly.
New ``WWWAuthenticateMiddleware`` (in ``core/middleware/``) wraps the
ASGI ``send`` callable and injects the bare ``Bearer`` challenge on
401 responses missing the header. Case-insensitive presence check so
stacking is safe; no-op on 2xx / 3xx / 4xx-other / 5xx so a 403 doesn't
confuse buyers about which auth scheme to apply. Registered AFTER
``AdminWSGIMount`` so Google-OAuth-gated admin paths short-circuit
before the buyer-protocol challenge sees them.
Bundled together because they ship in a single redeploy cycle and the
PR-title-check enforces one Conventional Commit prefix per PR; the three
fixes are independent at the code level (different files, different
behavioural surfaces, different tests).
## Residuals still open (not in this PR)
- ``pagination_integrity_list_accounts/first_page`` — ``has_more`` returns
false on a 3-seeded list with ``max_results=2``. Pagination logic in
``_apply_pagination`` is correct in isolation; the storyboard's seed→list
chain isn't reaching the impl with the expected request shape. Needs
separate investigation with end-to-end repro.
- ``media_buy_seller/proposal_finalize/get_products_refine`` — refine path
on ``get_products`` returns no ``proposals[]``. ``SalesAgentProposalManager.refine_products``
raises ``UNSUPPORTED_FEATURE``. Substantial feature work, separate PR.
- ``security_baseline/assert_mechanism`` — likely fixed transitively by
the ``WWW-Authenticate`` header; re-probe after deploy will confirm.
## Verification
- ``make quality`` — 4292 passed, 14 skipped, 19 xfailed
- New targeted tests: 30/30 (15 middleware × scope cases, 6 INVALID_STATE
behavioural × class cases, 9 INVALID_REQUEST schema cases)
- Existing ``test_max_daily_spend_exceeded`` updated to expect the new
wire code per the change description
- Structural guards (transport-agnostic-impl, no-toolerror-in-impl, etc.)
pass; the new middleware is a salesagent-side ASGI wrapper, not in
``_impl`` scope
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(comply): mark WWWAuthenticateMiddleware as workaround for adcp-client-python#712
The upstream defect is in ``adcp/server/auth.py:411`` —
``BearerTokenAuthMiddleware._unauthenticated`` emits a ``JSONResponse`` with
``status_code=401`` but no ``WWW-Authenticate`` header. The sibling
``A2ABearerAuthMiddleware._send_unauthenticated`` in the same file (line
1024) gets it right. Filed at adcontextprotocol/adcp-client-python#712.
Documents the deletion plan: when the upstream fix ships and we bump
``adcp``, the middleware's case-insensitive presence check makes it a
no-op, so the order is safe — bump → re-probe → remove the middleware
and its registration in a follow-up PR.
No code change. Comments only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review(comply): code-reviewer nits from PR #383
Fixes two factually-wrong claims and adds a regression guard, all
flagged by the code-reviewer pass:
1. ``media_buy_create.py:2418`` comment claimed "``VALIDATION_ERROR`` is
not in the spec enum and gets dropped by buyer agents walking
``STANDARD_ERROR_CODES``." It IS in the enum
(``adcp/types/generated_poc/enums/error_code.py:46``). Replace with
the actual justification — the storyboard-intersection argument —
and add a forward note about the dead ``PermissionError`` catch path
(no code inside this try raises it today; if a future principal-
ownership check moves in, split the except so PermissionError maps
to ``PERMISSION_DENIED``).
2. ``test_invalid_request_envelope_on_validation_failure.py`` carried
the same wrong claim in its module docstring. Rewrite to reflect
the actual intersection argument.
3. Add ``test_www_authenticate_runs_after_admin_mount_and_before_signing``
to ``test_serve_kwargs_middleware_order.py`` — pins ``WWWAuthenticateMiddleware``
between ``AdminWSGIMount`` (so admin Google-OAuth 401s don't get a
misleading Bearer challenge) and ``SigningVerifyMiddleware`` (so
signing-emitted 401s flow through the injector). A future refactor
that moves the middleware either direction surfaces here instead of
silently breaking RFC 6750 §3 compliance.
No behaviour change. Quality: 4293 passed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(proposal): implement v1 refine_products + flip capabilities.refine=True (#385)
* feat(proposal): implement v1 refine_products + flip capabilities.refine=True
Closes the ``media_buy_seller/proposal_finalize/get_products_refine``
storyboard failure observed in the 7.1.0 comply() probe against
Wonderstruck.
## What changed
* ``SalesAgentProposalManager.refine_products`` now has a real
implementation instead of raising ``UNSUPPORTED_FEATURE``. Delegates
to ``_get_products_impl`` for products, decorates the response with
a fresh ``Proposal`` via the existing
``_build_v1_brief_proposal`` even-split allocator, and populates
``refinement_applied[]`` from the buyer's ``refine[]`` asks.
* ``ProposalCapabilities.refine`` flipped from ``False`` to ``True``.
The framework router now dispatches ``buying_mode='refine'`` requests
to ``refine_products`` instead of falling through to ``get_products``
(which never populated ``refinement_applied``).
* New ``_build_v1_refinement_applied`` helper: dispatches each refine
entry's ``scope`` (``request`` / ``product`` / ``proposal``) to the
matching ``RefinementApplied{1,2,3}`` variant. Status is uniformly
``applied`` with a v1-acknowledgement note explaining the response
carries a fresh-but-unchanged-strategy proposal. Forward-compat:
unknown scopes and malformed entries (e.g. product-scope without
``product_id``) are silently dropped rather than crashing the
response.
## v1 vs v2 semantics
v1 is explicitly acknowledgement-shaped. Storyboard validation is
``field_present @ /proposals`` and ``response_schema`` — both satisfied
without semantic refinement. The note in each ``refinement_applied``
entry signals the v1 limitation so buyers see honest behaviour: the
proposal is fresh but the allocation hasn't been re-strategized from
the ask content. v2 will swap the even-split for an allocation that
actually honors asks (drop product / shift budget / shape targeting)
once ``ProposalStore`` is wired to load the prior draft by
``proposal_id``. The wire contract stays stable across v1/v2.
## Tests
Mirrors the pattern in ``test_proposal_manager_brief.py``:
* ``TestSalesAgentProposalManagerCapabilities`` pins
``capabilities.refine=True`` and the unchanged sales_specialism.
* ``TestBuildV1RefinementApplied`` covers every scope variant,
multi-entry ordering preservation, malformed-entry drop, unknown-
scope drop, and RootModel-wrapped entry unwrap.
* ``TestRefinementAppliedNote`` pins the buyer-facing breadcrumb so a
future content swap is intentional.
11 new tests; quality green (4273 passed, 14 skipped, 19 xfailed).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review(refine): cap buyer-supplied echo + drop dead fallback (PR #385 nits)
Addresses three review items:
**security-reviewer L1 (Should-Fix): bound buyer-supplied refine echo.**
``RefinementApplied2.product_id`` and ``RefinementApplied3.proposal_id``
are typed ``str`` with no length cap in the adcp library, so an
adversarial buyer could ship 10MB ids and force us to hold them through
Pydantic validation and echo them back. Added two caps in
``core/proposal/manager.py``:
* ``_MAX_REFINE_ID_LEN = 256`` — per-id length cap; oversize ids are
DROPPED (not truncated — truncation corrupts id semantics for
downstream correlation). Real AdCP ids look like ``prop_abc123`` /
``prod_video_outdoor``; 256 chars leaves generous headroom.
* ``_MAX_REFINE_ENTRIES = 50`` — array length cap, slice up front so
an N-million-entry array can't drive allocation pressure even
before the per-entry loop runs.
* New ``_is_safe_id`` helper centralizes the per-id check (also catches
non-str values, defense-in-depth for callers bypassing Pydantic).
**code-reviewer nit 1: drop dead ``or getattr(req, "refine", None)``.**
``_coerce_to_request_model`` returns a ``GetProductsRequest`` Pydantic
model that always has the ``refine`` attribute (default ``None``), so
the fallback can never fire. Simplified to
``getattr(req_model, "refine", None) or []``.
**code-reviewer nit 2: drop over-promised telemetry comment.**
The dropped-scope comment claimed "missing telemetry" without actually
emitting any. Replaced with the honest framing — forward-compat for
spec additions, known v1 limitation tracked for v2 telemetry — and
matched the docstring's "silently dropped" claim.
## New tests (6)
``TestRefineEchoLengthCaps`` in ``test_proposal_manager_refine.py``:
* ``test_oversized_product_id_dropped`` — 257-char id (cap+1) dropped
* ``test_oversized_proposal_id_dropped`` — same cap on proposal scope
* ``test_empty_product_id_dropped`` — zero-length symmetry
* ``test_max_length_id_accepted`` — boundary at exactly 256 chars
* ``test_excess_array_length_truncated`` — 100-entry array → 50 echoed
* ``test_non_string_product_id_dropped`` — defense-in-depth for non-str
Quality green: 4279 passed, 14 skipped, 19 xfailed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#336): enable Add Publisher on embedded view + fix(scheduler): DetachedInstanceError (#392)
* fix(#336): enable Add Publisher on embedded view
Embedded tenants couldn't add publisher partners — the UI hid the
controls and the API 403'd direct POSTs. Without publishers there are
no AuthorizedProperty rows, which empties the property selector and
blocks Create Product on embedded tenants.
PublisherPartner is not in the model-layer guard's locked set
(embedded_tenant_guard locks only Tenant core columns, AdapterConfig,
and signing creds), so publisher-partner mutations are publisher-managed
by definition. Apply the same opt-in pattern as PR #340: pass
allow_embedded_writes=True on the four mutation routes (add / delete /
sync / refresh) and drop the redundant _reject_if_embedded helper.
Template: unhide the +Add Publisher / Refresh-all buttons and the modal;
update the "Platform-managed" banner to scope only to the agent URL
(which IS platform-managed) rather than the partner roster.
Tests: flip TestPublisherPartnershipsReadonlyOnEmbedded →
TestPublisherPartnershipsEditableOnEmbedded; add positive coverage
test_managed_tenant_can_add_publisher_partner under
TestEmbeddedViewAllowsPublisherManagedWrites. Move the api-mode JSON
envelope assertion and gate-polarity check to the OIDC enable route
(still platform-managed, api_mode=True).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(scheduler): DetachedInstanceError on multi-buy delivery batch
Production trace (2026-05-08): the daily delivery-report batch
succeeded for the first media buy with a reporting_webhook, then
raised DetachedInstanceError on media_buy.tenant for every subsequent
buy in the same batch.
Root cause: each iteration calls _get_media_buy_delivery_impl, which
opens its own ``with get_db_session()``. Because get_db_session uses
a scoped_session, the inner ``scoped.remove()`` closes the SAME
session the outer batch loop is using, detaching every MediaBuy row
loaded by MediaBuyRepository.get_all_by_statuses. The first iteration
happens to complete before the inner remove() fires; iteration 2+
hits a detached instance on the next relationship access.
Fix: eager-load MediaBuy.tenant via joinedload in the scheduler's
fetch. The tenant value is materialized into the instance state and
survives detach, so media_buy.tenant returns the cached Tenant
without lazy-loading through a closed session.
Added eager_load_tenant=True parameter on
MediaBuyRepository.get_all_by_statuses (default False so the
media_buy_status_scheduler caller — which doesn't access tenant —
doesn't pay the JOIN cost).
Regression test reproduces the production trace exactly: two media
buys with reporting_webhook configured; without the fix, only one
webhook is sent and the second iteration raises DetachedInstanceError.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(#335): regression tests for product-save validation paths
The user reported "Internal Server Error" when saving a product
without selecting a Property, expecting a validation flash instead.
PR #340 closed #335 by fixing the embedded-write 403 that the
storefront proxy was misreporting; these tests document and lock in
the post-fix contract so the bug can't return undetected.
Six new scenarios under tests/admin/test_product_creation_integration.py:
- test_add_product_without_property_returns_validation_error_not_500:
POST with name + pricing, no property selection. Asserts 200 + the
"Please select at least one property tag" flash text + no leaked
Product row.
- test_add_product_malformed_inputs_never_return_500 (parametrized):
only_name, name_and_pricing_only, invalid_pricing_rate,
invalid_property_mode, property_ids_mode_no_selection. Every case
must surface a validation response, never a raw 500.
Both tests share a new ``authenticated_admin`` fixture that uses
UserFactory (CLAUDE.md Pattern #8) — re-loads the tenant inside the
factory's session to avoid DetachedInstanceError from the
test_tenant fixture's closed session.
All 17 product-create + delivery-webhook integration tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review: tighten scheduler regression + bound display_name + pin guard layer
Addresses code-review and security-review feedback on the three preceding
commits before merge.
Scheduler test (code-review C2): tie the regression assertion to the
actual failure mode via caplog. Without this, the test depended on
``await_count`` as a second-order signal; the new check catches
``DetachedInstanceError`` directly so a future refactor that changes
the send count for unrelated reasons can't silently mask the bug.
Guard layer consistency (code-review I4): new test
``test_publisher_partner_not_locked_at_model_layer`` exercises a
PublisherPartner write on an embedded tenant without the
``management_api_caller`` bypass. If a future change adds the model to
``embedded_tenant_guard``'s locked set, this test fails with a pointer
to remove ``allow_embedded_writes=True`` from the four
publisher_partners routes. Companion note added in
embedded_tenant_guard.py near the existing locked-table listeners.
Display-name length cap (security nit 1): add a 255-char gate on
``display_name`` in ``add_publisher_partner`` so a hostile or buggy
embedded caller can't persist multi-MB strings that later render into
admin UI / API responses.
Filed #391 for the systemic scoped_session/nested-get_db_session()
trap surfaced during scheduler triage (code-review C1). The scheduler
fix in ac3a3a7b is the right immediate patch; the underlying trap
needs its own redesign and is tracked separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#357): use url_for() for tenant admin links so embedded-mode mounts work (#393)
* fix(#357): use url_for() for tenant admin links so embedded-mode mounts work
The Setup Checklist (and other admin surfaces) emitted bare /tenant/<id>/...
hrefs. Under the Storefront's /storefront/salesagent mount, those resolved
against the storefront host instead of the proxied salesagent path and
returned 404 from the parent app.
Service layer (setup_checklist_service.py, dashboard_service.py,
business_activity_service.py) now builds URLs via flask.url_for() so the
emitted hrefs include SCRIPT_NAME automatically. Templates that previously
hand-prepended {{ request.script_root }} are migrated to url_for() in the
same pass for consistency with CLAUDE.md Pattern #6.
SetupChecklistService runs from two transports: Flask (admin UI) and
Starlette via adcp.server.serve() (MCP/A2A). validate_setup_complete()
fires inside _create_media_buy_impl on the non-Flask path, where url_for()
would raise RuntimeError. _build_url() catches that and returns None;
validate_setup_complete only reads task['name'], so the gate behavior is
unchanged. Added tests/unit/test_setup_checklist_no_flask_context.py to
pin this contract.
JS string interpolations (`${tenantId}/...`) are intentionally untouched
— url_for can't help with runtime IDs, and CLAUDE.md Pattern #6 already
endorses `scriptRoot + path` for that case.
One pre-existing FIXME left: tenant_settings.html /settings/raw form
posts to a route that has no handler; touched only with a clarifying
comment, not a behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#357): also tolerate BuildError when url_for runs in a foreign Flask app
The tenant_management_api blueprint runs as its own Flask app
(tests/integration/test_managed_tenant_api.py:54 — a bare Flask()
with only that blueprint registered). When SetupChecklistService is
invoked from that app (via tenant_status_service → /status), url_for()
for admin-UI endpoints raises werkzeug.routing.BuildError, not
RuntimeError, because the endpoint isn't registered there.
_build_url now catches both. The management API never reads
action_url (it surfaces configure_path from a static map in
tenant_status_service._CONFIGURE_PATHS), so None is correct.
Adds TestServiceWorksInForeignFlaskApp to pin the contract — Flask
context exists, but the endpoint can't be built.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(deps): bump adcp 5.3.0 → 5.4.0; drop three local workarounds (#394)
* chore(deps): bump adcp 5.3.0 → 5.4.0; drop three local workarounds
5.4.0 ships every upstream issue I filed yesterday plus a bonus.
## Drops with the bump
| Workaround | Upstream fix |
|---|---|
| ``core/middleware/www_authenticate.py`` (74 LOC) — injected ``WWW-Authenticate: Bearer`` on 401 because the MCP-leg ``BearerTokenAuthMiddleware._unauthenticated`` returned ``JSONResponse(status_code=401)`` without the header | adcp-client-python#712 → #715: ``_unauthenticated`` now emits the header on both the MCP dispatch path AND the ASGI ``_send_unauthenticated`` path, matching what the A2A leg already did |
| ``core/idempotency._ReplayMarkingStore`` (~120 LOC + private-symbol coupling to ``_WRAPPED_FUNCTIONS`` / ``_clone_response`` / ``_resolve_call_args`` / ``_to_dict`` / ``CachedResponse``) — reimplemented the full ``IdempotencyStore.wrap`` body inline to inject ``replayed: true`` on cache hits per AdCP L1/security rule 4 | adcp-client-python#714 → #717: ``IdempotencyStore.wrap`` now does ``response["replayed"] = True`` on the cache-hit branch natively |
| ``mcp_header_name="x-adcp-auth"`` + ``mcp_bearer_prefix_required=False`` — the MCP leg accepted ONLY the custom legacy header, EXCLUDING the spec-canonical ``Authorization: Bearer``. Caused ``security_baseline/probe_api_key`` storyboard failures | adcp-client-python#720 → #721: ``Authorization: Bearer`` is always accepted; ``mcp_legacy_header_aliases=[...]`` is a purely additive opt-in for adopters with deployed legacy clients |
Net diff: -533 LOC including the obsolete test files.
## Auth shape after bump
Old (broken for spec-compliant clients):
```python
BearerTokenAuth(
validate_token=_validate_token,
mcp_header_name="x-adcp-auth",
mcp_bearer_prefix_required=False,
)
```
New (spec compliance + zero break for legacy clients):
```python
BearerTokenAuth(
validate_token=_validate_token,
mcp_legacy_header_aliases=["x-adcp-auth"],
)
```
``Authorization: Bearer <token>`` is now accepted on both legs by default
(the spec carrier per RFC 6750). The ``x-adcp-auth`` legacy header keeps
working unchanged for any early-adopter MCP client still on it. Migration
is a one-way drift with no flag day.
## Files removed
- ``core/middleware/www_authenticate.py``
- ``core/tests/test_idempotency_replay_marking.py``
- ``tests/unit/test_www_authenticate_middleware.py``
- The ``WWWAuthenticateMiddleware``-ordering test in
``tests/unit/test_serve_kwargs_middleware_order.py``
## Verification
- ``make quality``: 4294 passed, 14 skipped, 19 xfailed
- Existing ``_ReplayMarkingStore`` callers in ``get_idempotency_store()``
swapped to plain ``IdempotencyStore`` — same constructor signature,
upstream provides the injection
- ``test_serve_kwargs_middleware_order.py`` updated to drop the
``WWWAuthenticateMiddleware``-position pin (middleware no longer
exists)
- After deploy, the compliance probe's ``security_baseline/probe_api_key``
and ``assert_mechanism`` storyboard steps should flip to pass — closes
the auth-header gap we filed as bokelley/salesagent#386 (which can now
be closed as "fixed upstream")
## Closes (when deployed)
- bokelley/salesagent#386 — multi-header auth (now native via #720)
- The remaining compliance-probe residual on ``security_baseline``
(3 → 1 failure; only ``proposal_finalize/create_media_buy`` left,
tracked separately as #387)
## Doesn't pick up
- 5.4.0's ``LazyPlatformRouter.proposal_stores=`` / ``proposal_store_factory=``
(#722/#724) — this is the wiring point for bokelley/salesagent#387.
Separate PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review(deps): switch test fixtures to new BearerTokenAuth shape (PR #394 nit)
Code-reviewer flagged that two test files still constructed
``BearerTokenAuth`` with the legacy ``mcp_header_name`` /
``mcp_bearer_prefix_required`` kwargs even though production swapped
to ``mcp_legacy_header_aliases=[...]`` in this PR's main commit. The
tests passed against 5.4.0 (back-compat shim works) but emitted
``DeprecationWarning`` and stopped mirroring the production config —
production now ACCEPTS ``Authorization: Bearer`` on the MCP leg
alongside ``x-adcp-auth``, but these test fixtures still wired the
old exclusive-header semantics.
## Changes
* ``tests/unit/test_per_leg_bearer_auth.py``: ``_production_auth`` and
``_build_mcp_app`` updated to the new shape. The inner
``BearerTokenAuthMiddleware`` construction now passes
``legacy_header_aliases=auth.resolved_mcp_legacy_aliases()`` and
``legacy_aliases_bearer_prefix_required=auth.legacy_aliases_bearer_prefix_required``
in place of the deprecated ``header_name`` / ``bearer_prefix_required``
pair. Mirrors what adcp.server.serve._wrap_mcp_with_auth does
natively against 5.4.0.
* ``tests/unit/test_agent_card_auth_scheme.py``: ``_production_auth``
same swap; module-level docstring updated to reflect that both legs
now default to ``Authorization`` and the MCP leg additively accepts
``x-adcp-auth`` for legacy adopters (not as an exclusive override).
## Verification
* 10/10 targeted tests pass
* ``make quality`` — 4294 passed, 14 skipped, 19 xfailed; warning
count dropped from 117 to 105 (the deprecation warnings are gone)
No behavior change beyond what PR #394's main commit ships. Tests now
exercise the same wire shape production uses.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(logs): strip debug instrumentation + collapse audit fan-out (#395)
Production log volume on Fly was 70%+ noise. Three categories of offender,
all in our code (not the adcp SDK):
src/core/context_manager.py
- Delete leftover ``console.print`` debug blocks: 🔍 PRE-COMMIT WEBHOOK
DEBUG (16 lines per workflow step update), 🔍 POST-COMMIT WEBHOOK DEBUG
(5 lines), and the 🚀 WEBHOOK / ⚠️ WEBHOOK SKIPPED chatter. These read
as leftover instrumentation from a past debugging session and fire on
every workflow step status change.
- Convert remaining ``console.print`` calls to ``logger.debug`` (lifecycle
events: context/step creation, object linking, webhook dispatch) or
``logger.warning`` / ``logger.exception`` (errors).
- Drop the unused ``rich.console.Console`` import and module-level
``console = Console()`` singleton.
src/core/helpers/adapter_helpers.py
- Demote ``[ADAPTER_SELECT]`` / ``[ADAPTER_CONFIG]`` from ``logger.info``
to ``logger.debug`` (10 call sites). These are tracing-grade fields,
not operational signals — they should be off in production unless
someone is actively debugging adapter selection.
src/core/audit_logger.py
- Collapse the per-detail audit fan-out: instead of one ``logger.info``
per ``details`` dict key (N+1 lines per audit event), emit a single
``"<message> | <details>"`` line. Full details still persist to
``AuditLog.details`` for structured queries — the per-line fan-out was
legible in local tail but flooded production stdout.
The ``adcp.audit`` logger name is shared with the SDK's ``LoggingAuditSink``
but the noisy emissions come from our own ``audit_logger`` in
``src/core/audit_logger.py``, not the SDK — so nothing to file upstream.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(logs): drop /mcp+/health access spam, rate-limit geo warning, fix trafficker_id log bug (#397)
Three followups from the audit pass on production fly logs.
src/core/logging_config.py
- Add ``UvicornAccessNoiseFilter`` and attach it to ``uvicorn.access`` in
both production (JSON) and development (standard) modes. The filter
drops 2xx GET/POST/HEAD/OPTIONS access lines on /mcp[/] and /health —
the two endpoints hit constantly by storefront MCP pollers and Fly's
TCP+HTTP health checks. 4xx/5xx still surface so auth failures and
server errors aren't buried. Other paths (admin UI, /a2a, /.well-known,
/mcp-debug, etc.) are unaffected. Behavioral contract pinned by 18
parametrized tests in tests/unit/test_uvicorn_access_filter.py.
src/adapters/gam/managers/targeting.py
- Rate-limit the "Could not load geo mappings file" + "Using empty geo
mappings" warnings to once per process lifetime via a module-level flag.
Each ``GAMTargetingManager`` instance fires on every adapter selection,
so the same warning flooded the log on every GAM-tenant request. The
underlying file-not-found is still tracked in #396 (it means GAM geo
targeting silently produces empty results in prod and needs a
packaging-side fix).
src/adapters/google_ad_manager.py
- Fix the "Could not auto-detect trafficker_id: User instance has no
attribute 'get'" warning. The googleads SOAP client returns a zeep
complex object — it supports __getitem__ and attribute access but NOT
``.get()``. The old code called ``current_user.get('name', 'Unknown')``
inside the success-log f-string, which raised AttributeError AFTER
``self.trafficker_id`` was already assigned. The ID was being detected
correctly all along; only the success log was failing and producing a
misleading warning on every request. Switched to ``getattr`` for the
optional ``name`` field.
Filed #396 to track the underlying production OOM kill on the iad
machine and the missing ``gam_geo_mappings.json`` packaging issue — both
infrastructure-level and out of scope here.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(proposal): wire Postgres-backed ProposalStore for create_media_buy(proposal_id=…) (#390)
* feat(proposal): wire Postgres-backed ProposalStore for create_media_buy(proposal_id=…)
Closes the proposal-lookup gap that made
``proposal_finalize/create_media_buy`` fail with
``INVALID_REQUEST: Invalid budget: 0.0``. Without a wired
:class:`ProposalStore`, the framework's ``proposal_dispatch`` had no
backing for the buyer's ``proposal_id`` and ``create_media_buy``
landed in package-derivation with zero packages.
Pieces:
- ``proposals`` table (migration ``r0s1t2u3v4w5``) — mirrors the
v1.5 ``ProposalRecord`` dataclass with multi-tenant scoping and a
partial unique on ``(account_id, media_buy_id) WHERE media_buy_id
IS NOT NULL`` for reverse-index lookups
- :class:`SalesAgentProposalStore` — implements every
:class:`adcp.decisioning.proposal_store.ProposalStore` Protocol
method (put_draft / get / commit / try_reserve_consumption /
finalize_consumption / release_consumption / mark_consumed /
discard / get_by_media_buy_id) against the new table. Atomic CAS
via ``SELECT … FOR UPDATE`` serializes parallel callers.
Cross-tenant probes collapse to ``None`` / ``PROPOSAL_NOT_FOUND``
per the Protocol's principal-enumeration defense.
- :class:`_LazyPlatformRouterWithStore` — thin subclass that adds
the ``proposal_store_for_tenant`` accessor the framework's
``proposal_dispatch`` duck-types. Upstream
:class:`LazyPlatformRouter` doesn't expose it (only the eager
:class:`PlatformRouter` does, via ``proposal_stores=``).
- Wired into ``build_router()`` — single shared store across
tenants; isolation runs inside the store on
``expected_account_id``.
v1 lifecycle compromise (documented in the store docstring): the
storyboard flow goes brief → create_media_buy WITHOUT an
intermediate finalize step, but the framework's
:meth:`try_reserve_consumption` requires the proposal to be in
``committed`` state. The store auto-commits at ``put_draft`` time
with a 7-day ``expires_at`` so the buyer flow unblocks today. The
Protocol surface is unchanged — only the internal lifecycle state
differs. When the manager declares ``finalize=True`` in v2, swap
to canonical ``draft`` + explicit commit.
Tests:
- ``tests/integration/test_proposal_store.py`` — 15 integration
tests against real Postgres covering put_draft auto-commit,
payload round-trip, refine overwrite, cross-tenant probe defense
(get + try_reserve), two-phase consumption lifecycle, atomic CAS
double-reservation rejection, reverse-index lookup with
``expected_account_id`` enforcement, idempotent release/discard
- ``tests/unit/test_lazy_router_with_proposal_store.py`` — 3 unit
tests pinning the router subclass's accessor wiring
- ``tests/unit/test_proposal_store_attributes.py`` — 2 unit tests
pinning ``is_durable=True`` (production-mode gate) and the
7-day default hold window
Refs #387
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(proposal): adopt adcp 5.4 — drop workarounds, use upstream surface
Upstream shipped both items we filed during #387:
- adcp-client-python#722 → 5.4: LazyPlatformRouter accepts
``proposal_stores=`` and ``proposal_store_factory=``. Deletes our
``_LazyPlatformRouterWithStore`` subclass.
- adcp-client-python#723 → 5.4: ``ProposalCapabilities.auto_commit_on_put_draft``
shipped option B from the issue. The framework now calls
``store.commit`` immediately after ``put_draft`` for opted-in
managers. Deletes our store-side ``state=COMMITTED`` workaround
in ``put_draft``.
Migration:
- Bump ``adcp>=5.4.0``.
- ``SalesAgentProposalManager.capabilities`` declares
``auto_commit_on_put_draft=True``; framework owns the
DRAFT → COMMITTED promotion via
``auto_commit_ttl_seconds=604800`` (7-day default, matches our
prior store-side hold window).
- ``core/main.build_router`` calls ``LazyPlatformRouter(...)`` directly
with ``proposal_store_factory=lambda _tid: shared_store``. Factory
shape over eager dict because the store is a single shared
instance — eager dict would force boot-time tenant enumeration
and miss tenants registered after boot.
- ``SalesAgentProposalStore.put_draft`` writes spec-canonical
``draft`` state with ``expires_at=None``. The ``_committed_hold``
constructor param and the 7-day default are gone — the framework's
``auto_commit_ttl_seconds`` capability owns the TTL.
Tests:
- Integration: 16 tests rewritten — put_draft asserts DRAFT (not
COMMITTED), reservation lifecycle tests use a ``_put_and_commit``
helper that mirrors the framework's auto-commit dispatch, new
``TestCommit`` class covers commit promotion + idempotency +
payload-drift rejection, new test pins that put_draft on a
COMMITTED record raises ``INTERNAL_ERROR`` per Protocol.
- Unit: deleted ``test_lazy_router_with_proposal_store.py`` (no
subclass to test); trimmed ``test_proposal_store_attributes.py``
to the durability flag only (the 7-day default belongs to the
framework now).
Refs #387
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(proposal): address review — split compound account_id, account-scoped locks, fail-closed unscoped methods
Review feedback on PR #390:
**B1 (blocker): _resolve_tenant_id_for_account returned account_id
verbatim.** SalesagentAccountStore.resolve mints ``f"{tenant_id}:{ref}"``
(``ref`` defaults to ``"default"``; storyboard runs use ``"acct_demo"``).
The framework passes ``ctx.account.id`` straight into ``put_draft``,
so every prod ``put_draft`` would FK-violate on ``proposals.tenant_id``.
Fixed: split on ``":"`` and take the prefix. New integration test
``test_put_draft_handles_compound_account_id`` regresses this — uses
the real shape the framework emits.
**Security MAJOR (×3): try_reserve / finalize / release did
SELECT FOR UPDATE then filtered account_id in Python.** Cross-tenant
probes acquired the row lock, leaking existence via timing AND
providing a DoS primitive against legitimate same-tenant operations.
Fixed: ``account_id`` moved into the WHERE clause so cross-tenant
probes never acquire the lock. Two new integration tests pin the
behavior:
- test_finalize_cross_tenant_collapses_to_internal_error
- test_release_cross_tenant_is_noop (verifies foreign tenant's
release doesn't roll back the owner's CONSUMING reservation)
**Security MAJOR (×2): discard() and mark_consumed() Protocol
signatures lack ``expected_account_id``.** Any caller obtaining a
``proposal_id`` could destroy / terminate another tenant's proposal.
Neither is called by adcp 5.4's ``proposal_dispatch`` today; fixed:
both raise ``NotImplementedError`` with an ERROR log. Future framework
versions that begin calling them surface loudly before reaching prod.
Two new tests pin the fail-closed behavior.
**MAJOR M3: _serialize_recipes silently passed dicts through.**
Violates "No quiet failures" (CLAUDE.md). Fixed: raises TypeError on
non-Pydantic input — caller has to pass typed Recipe instances.
**MINOR m3: lazy imports inside every method.** Hoisted
``ProposalRecord``, ``ProposalState``, ``AdcpError`` to module level —
no circular import; the salesagent already imports the library
at module-load time elsewhere.
**NIT n2/n3: stale temporal references.** Dropped "v1 auto-commit
workaround landed before #723 and is gone" from the store docstring
and "v1 auto-commits at put_draft time" from the Proposal model
docstring. Per CLAUDE.md: don't document the prior behavior.
**M2 partial coverage: end-to-end account_id shape test added.**
``test_put_draft_handles_compound_account_id`` exercises the realistic
``"tenant_id:default"`` shape the framework actually emits. Full
end-to-end (HTTP → proposal_dispatch → store) deferred to compliance
probe post-deploy — the unit layer pins every store-side invariant.
24 integration + unit tests pass; ``make quality`` clean (4311 tests).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review(proposal): expires_at guard + 8 lock-in tests cherry-picked from PR #398
Two additions from @bokelley's parallel #398 work that #390 lacked:
**1. Defense-in-depth expires_at check inside try_reserve_consumption.**
Security reviewer L1 finding on #398: a buyer holding a COMMITTED
proposal past its ``expires_at`` could reserve and finalize
indefinitely. The framework's
``proposal_dispatch._hydrate_proposal_context`` checks expiry on the
get-side, but ``try_reserve_consumption`` is reachable from dispatch
paths that bypass that filter (and from adopter callers that go
straight to the store). New three-line guard inside the existing row
lock raises ``PROPOSAL_EXPIRED`` with ``recovery="correctable"``.
Mirrors upstream :class:`InMemoryProposalStore._evict_expired_locked`
but surfaces the event rather than silently deleting so audit trails
survive.
**2. mark_consumed restored as implemented Protocol method.** Earlier
fail-closed pattern was over-cautious for a Protocol method the
framework doesn't currently call. Now matches the upstream
:class:`InMemoryProposalStore.mark_consumed` shape verbatim, with a
WARNING audit log on every call so unexpected invocations are
visible. Documented Protocol-signature gap (no
``expected_account_id``) — same upstream constraint that
:meth:`discard` has; ``discard`` stays fail-closed because the user's
follow-up list didn't include it.
**Tests (9 added, 1 replaced):**
- test_reserve_past_expires_at_raises_expired (locks in #1)
- test_release_silent_no_op_on_missing
- test_release_silent_no_op_on_cross_account
- test_finalize_idempotent_on_consumed_matching_media_buy
- test_finalize_mismatched_media_buy_raises
- test_mark_consumed_promotes_to_consumed
- test_mark_consumed_idempotent_on_matching
- test_mark_consumed_mismatched_raises
- test_mark_consumed_unknown_raises_internal_error
- Replaced ``test_mark_consumed_raises_not_implemented`` with the
four ``TestMarkConsumed`` cases above
All cherry-picked from #398's test suite (locked-in shapes already
correct in #390's code per @bokelley's close comment). 32 integration
+ unit tests pass; ``make quality`` clean (4311 tests).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(types): adopt SchemaVariant for 12 cross-class schema overrides (#400)
adcp 5.4.0 #718 ships ``SchemaVariant[T]`` + a mypy plugin that
rewrites the annotation to ``Any`` for override-compat purposes,
retiring the ``# type: ignore[assignment]`` stamps adopters used
to carry on cross-class entity overrides.
The 12 sites in src/core/schemas/ all match the cross-class pattern
the marker targets:
- 4× geo_*_exclude — parent declares Geo{Country,Region,Metro,
PostalArea}ExcludeItem; we substitute the inclusion variant
- 2× creatives — parent declares CreativeAsset; we substitute
our extended Creative
- 1× deployments — parent declares Deployments; we substitute
SignalDeployment
- 1× media_buys — parent declares MediaBuy; we substitute
the GetMediaBuysMediaBuy delivery-context view
- 1× ext — parent declares ExtensionObject; we use dict
- 1× sync_creatives.creatives — parent's CreativeAsset; we
use our local CreativeAsset subclass
- 1× query_summary — parent's QuerySummary; we use our local
- 1× media_buy_deliveries / 1× creatives in delivery.py —
delivery-context views
mypy.ini gets ``adcp.types.mypy_plugin`` added to the plugins
line alongside the existing sqlalchemy + pydantic plugins.
Tradeoff (documented upstream): inside the override, mypy sees the
field as ``Any``. ``typing.cast(list[T], self.field)`` recovers
precise inference at call sites that need it. None of the touched
sites currently rely on inside-override inference at usage sites,
so no cast() is needed for this change.
make quality: 4319 passed, 14 skipped, 19 xfailed.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: drop SchedulerLifespanMiddleware, use serve(on_startup=, on_shutdown=) (#401)
adcp 5.4.0 #713 ships native lifespan hooks on ``serve(transport='both')``,
which is exactly what the middleware was hand-rolling. The middleware
intercepted ASGI ``lifespan.startup`` / ``lifespan.shutdown`` scope events
to fire scheduler start/stop coroutines because earlier SDK versions
didn't expose a user-supplied lifespan extension point.
Now they do. The SDK's ``on_startup`` / ``on_shutdown`` kwargs take the
same ``Callable[[], Awaitable[None]]`` shape that ``_start_schedulers``
and ``_stop_schedulers`` already had, so the swap is mechanical:
- Drop ``SchedulerLifespanMiddleware`` from the ``asgi_middleware`` list.
- Pass ``on_startup=[_start_schedulers]`` / ``on_shutdown=[_stop_schedulers]``
in ``_serve_kwargs()``, conditional on ``include_scheduler`` (tests still
skip).
- Delete ``core/middleware/scheduler_lifespan.py`` (61 LOC).
- Update the ``_serve_kwargs`` docstring to reference the SDK hook instead.
The middleware ran scheduler shutdown with a 10s ``asyncio.wait_for``
guard; the SDK fires hooks unguarded. Our ``_stop_schedulers`` already
caps its own awaitables (delivery + media-buy status schedulers each
join their internal task groups with a bounded timeout), so dropping
the outer wait_for is fine — it was defensive double-bookkeeping.
make quality: 4319 passed, 14 skipped, 19 xfailed.
Closes the second of three local rip-outs unlocked by the adcp 5.4.0
bump. The third (AgentCardPublicUrlMiddleware → public_url callable)
lands separately.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: swap AgentCardPublicUrlMiddleware for public_url callable (#402)
The salesagent middleware existed because earlier SDK versions either
hardcoded ``http://localhost:{port}/`` into the agent card with no
override hook (pre-5.0) or crashed ``transport='both'`` startup when
``public_url`` was a callable (5.2.0, ``AttributeError: 'function'
object has no attribute 'router'``).
adcp 5.3.0 #680 fixed the composed-lifespan crash. 5.4.0 has confirmed
the callable path works under ``transport='both'`` in production. The
SDK's ``serve(public_url=PublicUrlResolver)`` is now the right primitive
for per-request agent-card URL derivation.
## What lands
- ``core/main._resolve_public_url(request) -> str`` — pure function
with the same header-precedence rules the middleware enforced:
PUBLIC_URL env > X-Forwarded-Host > Host, X-Forwarded-Proto for
scheme, ``http://`` for loopback / ``https://`` otherwise.
- Wired as ``"public_url": _resolve_public_url`` in ``_serve_kwargs``.
- Drop the middleware from the ``asgi_middleware`` list.
- Delete ``core/middleware/agent_card_public_url.py`` (189 LOC).
- Replace ``test_agent_card_public_url_middleware.py`` with 13 tests
of the new resolver covering: X-Forwarded-Host precedence, Host
fallback, comma-chain stripping, proto override, https default,
loopback http exception (matches SDK's ``_validate_card_url``),
PUBLIC_URL env override, no-headers fallback.
- Update ``test_serve_kwargs_middleware_order`` — replace the
middleware-present assertion with a ``public_url is callable``
assertion.
## Net diff
-442 LOC (mostly the middleware + ASGI plumbing tests it required)
+195 LOC (resolver doc + resolver tests + updated order test)
= -247 LOC net.
## What stays the same in production behavior
- PUBLIC_URL env takes precedence (single-host deploys unchanged).
- X-Forwarded-Host derives multi-tenant subdomain URLs (same as before).
- X-Forwarded-Proto controls scheme.
- Loopback hosts get ``http://`` (the SDK's _validate_card_url enforces
this — non-loopback ``http`` returns 500 from the SDK).
## What's different (intentional)
- The middleware refused to rewrite non-loopback URLs (defensive
pass-through). The resolver always derives the URL afresh. This is
safer: with the static-public_url fallback gone, the resolver is the
single source of truth and there's no "what gets rewritten vs
passed through" branching to reason about.
- Response-body buffering and content-length recalculation are gone —
the SDK builds the card from the resolver's URL directly.
make quality: 4322 passed, 14 skipped, 19 xfailed.
Closes the third of three local rip-outs unlocked by the adcp 5.4.0
bump (after SchemaVariant migration and SchedulerLifespanMiddleware
removal).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(ops): add tenant export/import for legacy → embedded migration (#403)
Reflection-based export/import for tenant-scoped data. Walks SQLAlchemy
metadata to discover all 41 tenant-scoped tables (including transitive
chains like media_packages → media_buys, strategy_states → strategies,
object_workflow_mapping → workflow_steps), then exports/imports rows in
FK-dependency order inside a single transaction.
Built for moving clients from legacy hosting to embedded mode (flip
is_embedded=True) on the same Postgres deployment. Also supports
cross-deployment moves via target-tenant-id retargeting and a
strip-secrets mode that wipes Fernet ciphertext + plaintext bearer
credentials (admin_token, slack/audit/hitl webhook URLs, GAM refresh
token, push_notification_configs auth, webhook subscription secret
hash, creative/signals agent auth_credentials, ai_config api_key).
principals.access_token is intentionally preserved so buyers'
MCP/A2A integrations keep working post-import.
Safety:
- alembic_revision pinned in the bundle; import refuses on schema mismatch
- pre-flight collision check on subdomain, virtual_host, principals.access_token
raises TenantImportCollisionError with precise message instead of opaque
IntegrityError
- strict column filtering when alembic revisions match (drops are a bug,
not noise); --allow-schema-drift downgrades to warning
- Core-level inserts bypass the embedded_tenant_guard ORM listeners; the
operator-CLI trust boundary is the equivalent privilege level
- export bundle written 0600 (contains tenant secrets)
- import writes an audit_logs row capturing operator, mode, flip-to-embedded,
target_tenant_id, row counts
- explicit rollback on any import-path failure
CLIs:
scripts/ops/export_tenant.py acme --out acme.json [--strip-secrets]
scripts/ops/import_tenant.py acme.json --mode=replace --flip-to-embedded
scripts/ops/import_tenant.py acme.json --target-tenant-id new --allow-schema-drift
scripts/ops/import_tenant.py acme.json --dry-run
19 integration tests covering discovery, round-trip, collision modes,
embedded flip, retargeting, strip-secrets (encrypted + plaintext bearer),
strict filtering, schema mismatch, audit log emission.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(admin-mount): serve /robots.txt as public Disallow / instead of 401 from A2A (#407)
Crawlers and probes hitting `GET /robots.txt` on the API host fell through
to the inner A2A app, where BearerTokenAuth 401'd them. Production logs
filled with `"GET /robots.txt HTTP/1.1" 401 Unauthorized` (plus a
paired `adcp.server.auth` JSON line per rejection), and well-behaved
crawlers got an inconsistent signal — 401 is not a stable "do not
crawl" answer.
robots.txt is a host-level resource, not a per-tenant one, so neither
Flask nor A2A is the right owner. `AdminWSGIMount` already short-
circuits an analogous static response (the apex `/` → `/signup` 302),
so colocate the robots short-circuit there:
- `GET`/`HEAD /robots.txt` → 200 `text/plain` with
`User-agent: *\nDisallow: /\n` and `cache-control: public, max-age=86400`
- non-safe methods (POST, etc.) fall through unchanged — the short
circuit only covers actual crawler probes
Tests: four scenarios in `TestAdminWSGIMountRobotsTxt` covering the
GET body+headers, HEAD-returns-no-body contract, POST falling through,
and the bug itself (the request must not reach the inner A2A app on
a non-admin host).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(embedded-guard): inline auth-flag diagnostics in rejection error (#408)
Surfaces session/connection auth-flag state directly in the
EmbeddedTenantWriteError message so SyncJob.error_message (and the
status widget that renders it) shows exactly why the guard fired
without log-diving. Distinguishes the three failure modes:
- session_present=False → object was detached at flush time
- session_flags=…
* feat(#352): emit proposals[] on get_products when buying_mode=brief (#361)
The ``media_buy_seller/proposal_finalize/get_products_brief`` storyboard
asserts that ``get_products`` calls with ``buying_mode='brief'`` return
a ``proposals[]`` array carrying at least one ``Proposal`` with a
``proposal_id`` buyers can echo into ``create_media_buy(proposal_id=...)``
to execute the bundle. Pre-PR the proposal manager forwarded directly to
``_get_products_impl`` and never emitted ``proposals``.
v1 strategy: split budget evenly across every product the publisher
returned. Each ``ProductAllocation`` references a real ``product_id``
and ``pricing_option_id`` from the response, percentages sum to exactly
100 (compensate for ``100/3`` non-termination on the final allocation
rather than 99.99-rounded), and the proposal gets a fresh
``proposal_id`` per call.
Only ``buying_mode='brief'`` triggers the proposal — wholesale and
refine opt out per spec. Empty product list short-circuits to no
proposal (the spec model requires ``min_length=1`` on allocations).
Future allocation strategies (weighted, refine-loaded drafts) plug into
the same ``_build_v1_brief_proposal`` seam without touching the manager.
## Verified
* Storyboard ``media_buy_seller/proposal_finalize/get_products_brief``:
PASS — every assertion green including
``field_present @ /proposals[0]/proposal_id``.
* 10 new unit tests in ``test_proposal_manager_brief.py`` pin builder
invariants (sum=100 across 1/2/3-product splits, unique proposal_id
per call, RootModel unwrapping, optional pricing_option_id).
* Full unit suite: 4295 passed.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#338): use url_for and relabel button to "Advertisers" on webhooks page (#367)
The "Manage Webhooks" button in templates/webhooks.html had two bugs:
1. The link used `{{ script_name }}` but the route that renders the page
(`operations.py:710`) does not pass `script_name`. In embedded iframe
context that template variable is undefined, so the link resolved to
an unprefixed path and 404'd.
2. The destination is the per-tenant principals list, not a webhook
management page — webhooks are per-principal. The label "Manage
Webhooks" was misleading.
Use `url_for()` so script-root resolution is automatic, and relabel the
button to "Advertisers" with a users icon to match its actual target.
The user reaches webhook management by clicking into an advertiser.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#362): hide AXE Set Key controls on embedded tenants (#368)
`AdapterConfig` is platform-managed on embedded tenants (only
`gam_sandbox_advertiser_id` is in `PUBLISHER_WRITABLE_FIELDS`), and
the `/settings/adapter` POST is intentionally not opted into
`allow_embedded_writes`. The Targeting Criteria Browser still rendered
the three "Set Include/Exclude/Macro Key" buttons + dropdowns + manual
entry, so clicking them returned 403 and the toast read "Failed to
save include key configuration."
Hide the editor block on embedded tenants and replace it with a
"Managed by platform" notice that points users at the upstream Tenant
Management API. The targeting-key browsing/preview UI below the card
stays visible — operators may want to look up keys when authoring
products.
Null-guard `populateAxeDropdowns` and `updateAxeKeyStatus` against the
now-absent select/status elements so the page JS doesn't throw when
the card body renders the alert variant.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(embedded): opt remaining read-only POST probes into embedded-write gate (#372)
Follow-up sweep to #365. The embedded-write gate keys off HTTP verb, so
every POST under `require_tenant_access` without `allow_embedded_writes=True`
returns 403 `embedded_writes_not_permitted` on embedded tenants — even
when the handler is a read-only probe that never touches the DB.
#365 fixed the two AI/Logfire probes called out in Laure's bug report.
Sweep covers the rest of the same class:
- `tenants.test_slack` — sends a test webhook, never writes
- `adapters.test_freewheel_connection` — validates OAuth client_credentials
against FreeWheel; reads AdapterConfig fallback secret, never writes
- `adapters.test_triton_connection` — validates JWT login against Triton;
reads AdapterConfig fallback secret, never writes
- `adapters.test_broadstreet_connection` — validates API key against
Broadstreet network endpoint, never writes
- `settings.test_domain_access` — looks up tenant access for an email
and flashes the result, never writes
Each handler was inspected to confirm zero DB writes before adding the
opt-in. The model-layer guard in `embedded_tenant_guard.py` remains in
force as defense-in-depth — any accidental Tenant/AdapterConfig write
from these paths would still be caught at commit time.
Longer-term: the verb-based gate misclassifying probes is a design
smell. A `probe=True` decorator argument that the gate honors would be
more durable than per-route opt-in. Filing as a follow-up — out of
scope for this sweep.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#365): allow AI/Logfire test-connection probes on embedded tenants (#369)
The "Test Connection" action for AI providers and Logfire on the
Integrations tab failed with "Failed: embedded_writes_not_permitted"
on embedded tenants, because the verb-based embedded-write gate
classifies any POST under `require_tenant_access` as a mutation.
`test_ai_connection` and `test_logfire_connection` are read-only
probes — they validate credentials against the upstream provider and
never write tenant state. Opt them into `allow_embedded_writes=True`;
the model-layer guard in `embedded_tenant_guard.py` remains in force
as defense-in-depth.
Also fix the test-result handlers in `templates/tenant_settings.html`
to render `data.message || data.error` instead of `data.error` alone.
Gate envelopes (and any future role-gate rejections) return both a
stable code in `error` and a human-readable string in `message`; the
old code surfaced the stable code, which read as gibberish to users.
Sweep finding (left as follow-up): the same verb-based-gate trap
exists on `tenants.test_slack`, `adapters.test_freewheel_connection`,
`adapters.test_triton_connection`, `adapters.test_broadstreet_connection`,
and `settings.test_domain_access`. Each is a read-only probe that
could opt in with the same flag.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#363): unblock Policies & Workflows writes on embedded tenants (#370)
On embedded tenants every field in the Policies & Workflows tab
(Brand Manifest Policy, Naming Conventions, Approval Workflows,
Measurement Providers, Product Ranking, Auto-approval thresholds)
silently reverted on save. Two compounding bugs:
1. Route blocked at the boundary. `/settings/business-rules` POST
used `@require_tenant_access(role=("admin",))` without
`allow_embedded_writes=True`, so the verb-based gate returned
403 `embedded_writes_not_permitted` before the handler ran.
2. JS treated the 403 HTML error page as success. `saveBusinessRules`
in `tenant_settings.js` content-type-branched: any HTML response
with no `.flash-messages` container fell through to
`window.location.reload()`. Flask's default 403 error page has no
flash messages → reload-as-success → user sees their fields revert
with no error. Affected every 4xx/5xx on that route.
Fix three layers:
- Add `allow_embedded_writes=True` to `update_business_rules`. Per
Sprint 5 design (`docs/design/embedded-mode-sprint-5.md` §"Pattern:
shared business logic with the UI"), business rules are
publisher-managed and edited via the proxied admin UI; the
management API exposes the same writes.
- Add the per-column business-rules surface to
`PUBLISHER_WRITABLE_FIELDS[Tenant]` (13 fields covering naming
templates, approval mode, creative review settings, AI policy,
advertising policy, brand manifest policy, product ranking prompt,
human review flag). Platform-identity columns (name, billing_plan,
is_active, subdomain, external_*) stay locked.
- Add `gam_manual_approval_required` / `mock_manual_approval_required`
to `PUBLISHER_WRITABLE_FIELDS[AdapterConfig]` — these mirror
`tenant.human_review_required` onto adapter config and are written
by the same handler.
- Restructure `saveBusinessRules` to check `response.ok` BEFORE
content-type branching. Non-2xx responses now surface the error
(parsing flash messages from HTML when available, falling back to
the status code) instead of silently reloading.
Added four guard tests in `test_managed_tenant_api.py::TestWriteGuard`:
business-rules columns write, manual-approval adapter columns write,
platform-identity columns stay blocked, and an end-to-end check via the
mock adapter sync field.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#364): explain empty Allowed Principals dropdown on embedded tenants (#371)
* fix(#364): explain empty Allowed Principals dropdown on embedded tenants
On embedded tenants the "Allowed Principals (Advertisers)" multi-select
on Create Product rendered only "No principals configured" — a dead
end. The Buyer Agents section in Settings hides the "Add Buyer Agent"
button on embedded tenants (this is correct: Principal provisioning is
platform-managed via the Tenant Management API), so publishers had no
path to populate the dropdown.
Two compounding things made the UI misleading:
1. The empty-state placeholder didn't distinguish embedded from open
instances. Publishers saw the same "No principals configured" text
that suggests they can fix it themselves.
2. Comments in `tenant_settings.html` and `buyer_advertiser_routing.py`
claimed Principals are "auto-created on first request by the
embedded-mode auth bypass, which reads X-Identity-Buyer-Principal-Id".
That mechanism does not exist — grep `src/` for the header returns
zero matches. Anyone tracing the empty dropdown ran into a dead-end
comment that confidently pointed at a code path that isn't there.
Fix:
- In `add_product.html` and `add_product_gam.html`, replace the
disabled `<option>No principals configured</option>` with a
context-aware empty state. Embedded tenants get an explainer that
Principals are provisioned by the platform via the Tenant Management
API; open instances get a pointer to Settings → Buyer Agents.
- Rewrite the misleading comment block in `tenant_settings.html`
around the advertisers section and the user-visible "auto-created
from request headers" line — state plainly that embedded Principal
provisioning goes through the platform API.
- Fix the matching dead-pointer comment in
`buyer_advertiser_routing.py` near the access-grant logic.
Option B (platform-managed) per `docs/design/embedded-mode-sprint-5.md`
contract. Option A (re-enable UI authoring) would have been a
write-guard expansion that contradicts the existing
`{% if not embedded_view %}` gate on "Add Buyer Agent" — and the model
guard doesn't list Principal at all, so it's the UI gate alone holding
the line. Not the right place to flip the contract.
Terminology cleanup ("Allowed Principals" vs "Buyer Agents" vs
"Advertisers") is deliberately left for a follow-up issue — that's a
larger UX project than a bug fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(#364): update assertions to match corrected embedded-mode copy
The original test asserted on the misleading "auto-created from request
headers" copy that #364 removed (because the auto-create mechanism does
not exist — see #364 PR description). Update the assertions to match
the new, accurate copy that explains platform-API provisioning.
Also refresh the class docstring to drop the same misleading claim about
header-based auto-creation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#374): coerce update_media_buy status to wire enum at response boundary (#375)
The manual-approval path on ``update_media_buy`` read ``MediaBuy.status``
straight from the DB column and surfaced it on the
``UpdateMediaBuySuccess`` response. The persisted column accepts a
broader set than the AdCP wire enum — ``draft`` (model default) and
``pending_approval`` (manual-approval create path) are both valid in
storage but not in ``MediaBuyStatus``. fastmcp's request-/response-side
Pydantic validation rejected the response with
``INVALID_REQUEST[status]: Input should be 'pending_creatives',
'pending_start', 'active', 'paused', 'completed', 'rejected' or
'canceled'``, which surfaced as an E2E failure on
``test_complete_campaign_lifecycle_with_webhooks`` (#374) and on every
PR's CI run after the manual-approval status-emission was added in #353.
Fix:
- Add ``_to_wire_status`` in ``media_buy_list.py``. Takes any input
(``str | MediaBuyStatus | None``) and returns either a wire-valid
string from the seven-member enum, or ``None`` for values the wire
rejects. Case-insensitive on string input.
- Apply it at the manual-approval response site in
``update_media_buy.py``. ``current_status`` is now guaranteed
wire-valid (or ``None``) before reaching ``UpdateMediaBuySuccess``.
The other three response-status sites (cancel, pause/resume, final
``_compute_status`` path) already emit values from the wire enum
by construction.
Tests:
- ``TestToWireStatus`` (6 cases): wire-valid passthrough, case
insensitivity, persisted-only rejection (``draft``,
``pending_approval``), ``None``/empty/non-string handling.
- ``test_manual_approval_response_coerces_non_wire_db_status_to_none``:
end-to-end behavior — a persisted ``pending_approval`` does not leak
to the response.
- ``test_manual_approval_response_preserves_wire_valid_db_status``:
wire-valid statuses still pass through unchanged.
Verified locally:
- Failing E2E ``test_complete_campaign_lifecycle_with_webhooks`` passes
against the full Docker stack.
- ``tox -e unit`` (4314 tests) and ``tox -e integration`` (1030
update_media_buy-adjacent tests) both green.
Fixes #374.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(transport): env-gated stateless MCP mode for multi-replica deploys (#376)
The MCP Python SDK's ``StreamableHTTPSessionManager`` stores
``_server_instances`` as a process-local dict. Multi-replica deployments
without sticky LB routing on ``Mcp-Session-Id`` see ``tools/list`` and
``tools/call`` randomly 404 with "Session not found" when a request
lands on a replica that didn't handle ``initialize``.
A 10-attempt probe against the Wonderstruck deployment confirmed the
dice roll: ``initialize`` always 200 (creates session on whichever
replica answers); ``tools/list`` and ``tools/call`` with the same
session ID succeeded only when they happened to land on the same
replica (~50/50 each). Yesterday's compliance baseline (170 steps, 12
tools discovered) caught the deployment during a single-replica window;
today the same baseline rerun returned 0 tools because
``discoverAgentProfile`` calls ``initialize`` → ``tools/list`` in tight
succession, and ``tools/list`` lost the affinity coin flip half the
time.
``serve()`` has supported ``stateless_http: bool`` since adcp 5.0
(``adcp/server/serve.py:2053`` sets ``mcp.settings.stateless_http``
from the kwarg unconditionally, so ``FASTMCP_STATELESS_HTTP`` env
alone has no effect — the kwarg overrides FastMCP's reader). This
plumbs the kwarg through ``_serve_kwargs`` gated on
``ADCP_STATELESS_HTTP``:
* Unset / falsy → stateful (default). Single-replica prod, local dev,
in-process tests, and the compliance-runner storyboard sweep keep
the session-reuse perf optimization.
* ``ADCP_STATELESS_HTTP=true`` → stateless. Each request creates a
fresh transport context; multi-replica works without sticky LB.
Per the FastMCP deployment doc
(https://gofastmcp.com/v2/deployment/http): stateless mode is the
recommended pattern for horizontal scaling — cookie-based stickiness
is unreliable because most MCP clients use ``fetch()`` and drop
``Set-Cookie``. Header-based stickiness on ``Mcp-Session-Id`` would
also work (the AdCP SDK forwards the header cleanly) and would keep
session-reuse perf on prod compliance runs; this env var doesn't
preclude that — the deployment chooses by setting / unsetting
``ADCP_STATELESS_HTTP``.
Tests verify the env var maps to the kwarg correctly across true /
false / unset and case variants. Existing
``test_serve_kwargs_middleware_order.py`` extended with the new
``stateless_http``-focused cases.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(deps): bump adcp 5.2.0 → 5.3.0 (#379)
Picks up:
- SellerA2AClient for in-process A2A handler testing (#694)
- PgBuyerAgentRegistry.with_caching() factory (#692)
- v3 storyboard CI gate that actually asserts (#693)
- Sequence[T] widening on response-only list fields (#635)
- Composed lifespan preservation when public_url is callable (#680)
- ads.txt MANAGERDOMAIN fallback discovery (#704/#705)
- validate_adagents_structure helper (#708)
- webhook_signing.supported boot validator (#695)
Audited the codebase for workarounds the bump should now obsolete. One
real candidate: AgentCardPublicUrlMiddleware (190 LOC) — #680 means
transport="both" + callable public_url now works. Replacing it with a
public_url=resolver callable will land separately.
Two workarounds the bump can't eliminate, filed upstream:
- serve(lifespan=) hook missing — adcp-client-python#709
- cross-class entity overrides still need type:ignore[assignment] —
adcp-client-python#710
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#377): four-state aao_status_kind + permissive unbound resolution (#380)
Wonderstruck-class publishers ship bare ``authorized_agents`` entries
(``{url, authorized_for}`` only, no ``authorization_type``) alongside a
top-level ``properties[]`` block. The AdCP SDK's strict resolver returns
``[]`` for these, so:
- Publisher Partnerships chip rendered "Pending 0/0" — misleading
operators into thinking the publisher hadn't authorized us yet.
- Products UI used to bind anyway via a homegrown heuristic, then a
prior pass tightened it to match the SDK — regressing Wonderstruck.
This change introduces a four-state ``PublisherPartnerStatusKind``
(``authorized`` | ``unbound`` | ``pending`` | ``no_properties`` |
``unreachable``) and an explicit permissive resolution path:
- ``aao_lookup_service.get_publisher_partner_status`` uses the SDK
strictly first; falls back to ``unbound`` only when our entry is
bare and the file has top-level properties. Surfaces a conformance
hint so operators can nudge the publisher to add a typed binding.
- ``property_discovery_service._extract_properties`` mirrors the same
classification and, on the unbound branch, gates top-level
properties to those carrying a ``type=domain`` identifier matching
the publisher_domain — closes the attack vector where a publisher
could bare-list us + claim arbitrary app/podcast/DOOH bundle IDs.
- Shared shape helpers in ``src/services/_adagents_shapes.py``
(``is_bare_entry``, ``find_agent_entry``, ``top_level_properties``)
cover the full schema selector set including ``signal_ids`` /
``signal_tags``.
- New nullable ``aao_status_kind`` column on ``publisher_partners`` —
legacy NULL rows fall back to the existing derivation in
``_partner_to_dict`` so the rollout is safe under rolling deploys.
- JS chip styles for ``unbound`` ("Authorized (non-conformant file)")
and ``no_properties`` ("No properties listed").
Upstream issues filed in parallel for ecosystem alignment:
- adcontextprotocol/adcp#4478 — typed
``authorization_type: "all_top_level_properties"`` variant so
publishers have a spec-conformant shape; once shipped we can
deprecate the local permissive shim.
- adcontextprotocol/adcp-client-python#711 — permissive resolver API.
- adcontextprotocol/adcp-client#1721 — TS SDK per-agent resolution +
permissive mode for cross-SDK consistency.
Fixes #377
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(compliance): residual fixes from 7.1.0 probe — INVALID_REQUEST, INVALID_STATE, WWW-Authenticate (#383)
* fix(compliance): residual fixes from 7.1.0 probe — INVALID_REQUEST, INVALID_STATE, WWW-Authenticate
Closes three residual storyboard failures observed in the 7.1.0 comply()
re-probe against Wonderstruck after #348/#349 fixes deployed:
1. **error_compliance/nonexistent_product** — pre-dispatch validation in
``_create_media_buy_impl`` raised ``ValueError`` (past start_time, reversed
dates, etc.) and the outer ``except (ValueError, PermissionError)`` handler
emitted ``Error(code="VALIDATION_ERROR")``. ``VALIDATION_ERROR`` is not in
the AdCP 3.0 ``STANDARD_ERROR_CODES`` enum, so buyer agents walking the
enum for self-correction silently drop the error. Change wire code to
spec-canonical ``INVALID_REQUEST``. Storyboard expects
``PRODUCT_NOT_FOUND``, ``PRODUCT_UNAVAILABLE``, or ``INVALID_REQUEST``;
sibling ``reversed_dates_error`` accepts ``VALIDATION_ERROR`` or
``INVALID_REQUEST``. ``INVALID_REQUEST`` is the only value in both
sets and is the spec-canonical choice.
2. **media_buy_state_machine/pause_canceled_buy** — ``_update_media_buy_impl``
had a terminal-state guard on cancel (re-cancel raises
``AdCPNotCancellableError``) but the pause/resume branch dispatched
straight to the adapter. Spec requires rejection with
``/adcp_error/code == "INVALID_STATE"`` for pause-of-canceled.
New exception ``AdCPInvalidStateError`` (``error_code="INVALID_STATE"``,
recovery ``correctable``, 422) covers the symmetric guard. Fires
BEFORE adapter dispatch on both terminal states (``canceled``,
``completed``) for both actions (``paused=True``, ``paused=False``).
Idempotency-spec friendly: same payload yields the same wire code on
retry regardless of which adapter would have handled the transition.
3. **security_baseline/probe_unauth** — RFC 6750 §3 requires a
``WWW-Authenticate: Bearer`` header on every 401 from a Bearer-protected
resource. Upstream ``adcp.server.auth.BearerTokenAuthMiddleware`` on the
MCP leg returns 401 without the header for missing/invalid tokens; the
A2A leg and ``SigningVerifyMiddleware`` already emit it correctly.
New ``WWWAuthenticateMiddleware`` (in ``core/middleware/``) wraps the
ASGI ``send`` callable and injects the bare ``Bearer`` challenge on
401 responses missing the header. Case-insensitive presence check so
stacking is safe; no-op on 2xx / 3xx / 4xx-other / 5xx so a 403 doesn't
confuse buyers about which auth scheme to apply. Registered AFTER
``AdminWSGIMount`` so Google-OAuth-gated admin paths short-circuit
before the buyer-protocol challenge sees them.
Bundled together because they ship in a single redeploy cycle and the
PR-title-check enforces one Conventional Commit prefix per PR; the three
fixes are independent at the code level (different files, different
behavioural surfaces, different tests).
## Residuals still open (not in this PR)
- ``pagination_integrity_list_accounts/first_page`` — ``has_more`` returns
false on a 3-seeded list with ``max_results=2``. Pagination logic in
``_apply_pagination`` is correct in isolation; the storyboard's seed→list
chain isn't reaching the impl with the expected request shape. Needs
separate investigation with end-to-end repro.
- ``media_buy_seller/proposal_finalize/get_products_refine`` — refine path
on ``get_products`` returns no ``proposals[]``. ``SalesAgentProposalManager.refine_products``
raises ``UNSUPPORTED_FEATURE``. Substantial feature work, separate PR.
- ``security_baseline/assert_mechanism`` — likely fixed transitively by
the ``WWW-Authenticate`` header; re-probe after deploy will confirm.
## Verification
- ``make quality`` — 4292 passed, 14 skipped, 19 xfailed
- New targeted tests: 30/30 (15 middleware × scope cases, 6 INVALID_STATE
behavioural × class cases, 9 INVALID_REQUEST schema cases)
- Existing ``test_max_daily_spend_exceeded`` updated to expect the new
wire code per the change description
- Structural guards (transport-agnostic-impl, no-toolerror-in-impl, etc.)
pass; the new middleware is a salesagent-side ASGI wrapper, not in
``_impl`` scope
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(comply): mark WWWAuthenticateMiddleware as workaround for adcp-client-python#712
The upstream defect is in ``adcp/server/auth.py:411`` —
``BearerTokenAuthMiddleware._unauthenticated`` emits a ``JSONResponse`` with
``status_code=401`` but no ``WWW-Authenticate`` header. The sibling
``A2ABearerAuthMiddleware._send_unauthenticated`` in the same file (line
1024) gets it right. Filed at adcontextprotocol/adcp-client-python#712.
Documents the deletion plan: when the upstream fix ships and we bump
``adcp``, the middleware's case-insensitive presence check makes it a
no-op, so the order is safe — bump → re-probe → remove the middleware
and its registration in a follow-up PR.
No code change. Comments only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review(comply): code-reviewer nits from PR #383
Fixes two factually-wrong claims and adds a regression guard, all
flagged by the code-reviewer pass:
1. ``media_buy_create.py:2418`` comment claimed "``VALIDATION_ERROR`` is
not in the spec enum and gets dropped by buyer agents walking
``STANDARD_ERROR_CODES``." It IS in the enum
(``adcp/types/generated_poc/enums/error_code.py:46``). Replace with
the actual justification — the storyboard-intersection argument —
and add a forward note about the dead ``PermissionError`` catch path
(no code inside this try raises it today; if a future principal-
ownership check moves in, split the except so PermissionError maps
to ``PERMISSION_DENIED``).
2. ``test_invalid_request_envelope_on_validation_failure.py`` carried
the same wrong claim in its module docstring. Rewrite to reflect
the actual intersection argument.
3. Add ``test_www_authenticate_runs_after_admin_mount_and_before_signing``
to ``test_serve_kwargs_middleware_order.py`` — pins ``WWWAuthenticateMiddleware``
between ``AdminWSGIMount`` (so admin Google-OAuth 401s don't get a
misleading Bearer challenge) and ``SigningVerifyMiddleware`` (so
signing-emitted 401s flow through the injector). A future refactor
that moves the middleware either direction surfaces here instead of
silently breaking RFC 6750 §3 compliance.
No behaviour change. Quality: 4293 passed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(proposal): implement v1 refine_products + flip capabilities.refine=True (#385)
* feat(proposal): implement v1 refine_products + flip capabilities.refine=True
Closes the ``media_buy_seller/proposal_finalize/get_products_refine``
storyboard failure observed in the 7.1.0 comply() probe against
Wonderstruck.
## What changed
* ``SalesAgentProposalManager.refine_products`` now has a real
implementation instead of raising ``UNSUPPORTED_FEATURE``. Delegates
to ``_get_products_impl`` for products, decorates the response with
a fresh ``Proposal`` via the existing
``_build_v1_brief_proposal`` even-split allocator, and populates
``refinement_applied[]`` from the buyer's ``refine[]`` asks.
* ``ProposalCapabilities.refine`` flipped from ``False`` to ``True``.
The framework router now dispatches ``buying_mode='refine'`` requests
to ``refine_products`` instead of falling through to ``get_products``
(which never populated ``refinement_applied``).
* New ``_build_v1_refinement_applied`` helper: dispatches each refine
entry's ``scope`` (``request`` / ``product`` / ``proposal``) to the
matching ``RefinementApplied{1,2,3}`` variant. Status is uniformly
``applied`` with a v1-acknowledgement note explaining the response
carries a fresh-but-unchanged-strategy proposal. Forward-compat:
unknown scopes and malformed entries (e.g. product-scope without
``product_id``) are silently dropped rather than crashing the
response.
## v1 vs v2 semantics
v1 is explicitly acknowledgement-shaped. Storyboard validation is
``field_present @ /proposals`` and ``response_schema`` — both satisfied
without semantic refinement. The note in each ``refinement_applied``
entry signals the v1 limitation so buyers see honest behaviour: the
proposal is fresh but the allocation hasn't been re-strategized from
the ask content. v2 will swap the even-split for an allocation that
actually honors asks (drop product / shift budget / shape targeting)
once ``ProposalStore`` is wired to load the prior draft by
``proposal_id``. The wire contract stays stable across v1/v2.
## Tests
Mirrors the pattern in ``test_proposal_manager_brief.py``:
* ``TestSalesAgentProposalManagerCapabilities`` pins
``capabilities.refine=True`` and the unchanged sales_specialism.
* ``TestBuildV1RefinementApplied`` covers every scope variant,
multi-entry ordering preservation, malformed-entry drop, unknown-
scope drop, and RootModel-wrapped entry unwrap.
* ``TestRefinementAppliedNote`` pins the buyer-facing breadcrumb so a
future content swap is intentional.
11 new tests; quality green (4273 passed, 14 skipped, 19 xfailed).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review(refine): cap buyer-supplied echo + drop dead fallback (PR #385 nits)
Addresses three review items:
**security-reviewer L1 (Should-Fix): bound buyer-supplied refine echo.**
``RefinementApplied2.product_id`` and ``RefinementApplied3.proposal_id``
are typed ``str`` with no length cap in the adcp library, so an
adversarial buyer could ship 10MB ids and force us to hold them through
Pydantic validation and echo them back. Added two caps in
``core/proposal/manager.py``:
* ``_MAX_REFINE_ID_LEN = 256`` — per-id length cap; oversize ids are
DROPPED (not truncated — truncation corrupts id semantics for
downstream correlation). Real AdCP ids look like ``prop_abc123`` /
``prod_video_outdoor``; 256 chars leaves generous headroom.
* ``_MAX_REFINE_ENTRIES = 50`` — array length cap, slice up front so
an N-million-entry array can't drive allocation pressure even
before the per-entry loop runs.
* New ``_is_safe_id`` helper centralizes the per-id check (also catches
non-str values, defense-in-depth for callers bypassing Pydantic).
**code-reviewer nit 1: drop dead ``or getattr(req, "refine", None)``.**
``_coerce_to_request_model`` returns a ``GetProductsRequest`` Pydantic
model that always has the ``refine`` attribute (default ``None``), so
the fallback can never fire. Simplified to
``getattr(req_model, "refine", None) or []``.
**code-reviewer nit 2: drop over-promised telemetry comment.**
The dropped-scope comment claimed "missing telemetry" without actually
emitting any. Replaced with the honest framing — forward-compat for
spec additions, known v1 limitation tracked for v2 telemetry — and
matched the docstring's "silently dropped" claim.
## New tests (6)
``TestRefineEchoLengthCaps`` in ``test_proposal_manager_refine.py``:
* ``test_oversized_product_id_dropped`` — 257-char id (cap+1) dropped
* ``test_oversized_proposal_id_dropped`` — same cap on proposal scope
* ``test_empty_product_id_dropped`` — zero-length symmetry
* ``test_max_length_id_accepted`` — boundary at exactly 256 chars
* ``test_excess_array_length_truncated`` — 100-entry array → 50 echoed
* ``test_non_string_product_id_dropped`` — defense-in-depth for non-str
Quality green: 4279 passed, 14 skipped, 19 xfailed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#336): enable Add Publisher on embedded view + fix(scheduler): DetachedInstanceError (#392)
* fix(#336): enable Add Publisher on embedded view
Embedded tenants couldn't add publisher partners — the UI hid the
controls and the API 403'd direct POSTs. Without publishers there are
no AuthorizedProperty rows, which empties the property selector and
blocks Create Product on embedded tenants.
PublisherPartner is not in the model-layer guard's locked set
(embedded_tenant_guard locks only Tenant core columns, AdapterConfig,
and signing creds), so publisher-partner mutations are publisher-managed
by definition. Apply the same opt-in pattern as PR #340: pass
allow_embedded_writes=True on the four mutation routes (add / delete /
sync / refresh) and drop the redundant _reject_if_embedded helper.
Template: unhide the +Add Publisher / Refresh-all buttons and the modal;
update the "Platform-managed" banner to scope only to the agent URL
(which IS platform-managed) rather than the partner roster.
Tests: flip TestPublisherPartnershipsReadonlyOnEmbedded →
TestPublisherPartnershipsEditableOnEmbedded; add positive coverage
test_managed_tenant_can_add_publisher_partner under
TestEmbeddedViewAllowsPublisherManagedWrites. Move the api-mode JSON
envelope assertion and gate-polarity check to the OIDC enable route
(still platform-managed, api_mode=True).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(scheduler): DetachedInstanceError on multi-buy delivery batch
Production trace (2026-05-08): the daily delivery-report batch
succeeded for the first media buy with a reporting_webhook, then
raised DetachedInstanceError on media_buy.tenant for every subsequent
buy in the same batch.
Root cause: each iteration calls _get_media_buy_delivery_impl, which
opens its own ``with get_db_session()``. Because get_db_session uses
a scoped_session, the inner ``scoped.remove()`` closes the SAME
session the outer batch loop is using, detaching every MediaBuy row
loaded by MediaBuyRepository.get_all_by_statuses. The first iteration
happens to complete before the inner remove() fires; iteration 2+
hits a detached instance on the next relationship access.
Fix: eager-load MediaBuy.tenant via joinedload in the scheduler's
fetch. The tenant value is materialized into the instance state and
survives detach, so media_buy.tenant returns the cached Tenant
without lazy-loading through a closed session.
Added eager_load_tenant=True parameter on
MediaBuyRepository.get_all_by_statuses (default False so the
media_buy_status_scheduler caller — which doesn't access tenant —
doesn't pay the JOIN cost).
Regression test reproduces the production trace exactly: two media
buys with reporting_webhook configured; without the fix, only one
webhook is sent and the second iteration raises DetachedInstanceError.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(#335): regression tests for product-save validation paths
The user reported "Internal Server Error" when saving a product
without selecting a Property, expecting a validation flash instead.
PR #340 closed #335 by fixing the embedded-write 403 that the
storefront proxy was misreporting; these tests document and lock in
the post-fix contract so the bug can't return undetected.
Six new scenarios under tests/admin/test_product_creation_integration.py:
- test_add_product_without_property_returns_validation_error_not_500:
POST with name + pricing, no property selection. Asserts 200 + the
"Please select at least one property tag" flash text + no leaked
Product row.
- test_add_product_malformed_inputs_never_return_500 (parametrized):
only_name, name_and_pricing_only, invalid_pricing_rate,
invalid_property_mode, property_ids_mode_no_selection. Every case
must surface a validation response, never a raw 500.
Both tests share a new ``authenticated_admin`` fixture that uses
UserFactory (CLAUDE.md Pattern #8) — re-loads the tenant inside the
factory's session to avoid DetachedInstanceError from the
test_tenant fixture's closed session.
All 17 product-create + delivery-webhook integration tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review: tighten scheduler regression + bound display_name + pin guard layer
Addresses code-review and security-review feedback on the three preceding
commits before merge.
Scheduler test (code-review C2): tie the regression assertion to the
actual failure mode via caplog. Without this, the test depended on
``await_count`` as a second-order signal; the new check catches
``DetachedInstanceError`` directly so a future refactor that changes
the send count for unrelated reasons can't silently mask the bug.
Guard layer consistency (code-review I4): new test
``test_publisher_partner_not_locked_at_model_layer`` exercises a
PublisherPartner write on an embedded tenant without the
``management_api_caller`` bypass. If a future change adds the model to
``embedded_tenant_guard``'s locked set, this test fails with a pointer
to remove ``allow_embedded_writes=True`` from the four
publisher_partners routes. Companion note added in
embedded_tenant_guard.py near the existing locked-table listeners.
Display-name length cap (security nit 1): add a 255-char gate on
``display_name`` in ``add_publisher_partner`` so a hostile or buggy
embedded caller can't persist multi-MB strings that later render into
admin UI / API responses.
Filed #391 for the systemic scoped_session/nested-get_db_session()
trap surfaced during scheduler triage (code-review C1). The scheduler
fix in ac3a3a7b is the right immediate patch; the underlying trap
needs its own redesign and is tracked separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#357): use url_for() for tenant admin links so embedded-mode mounts work (#393)
* fix(#357): use url_for() for tenant admin links so embedded-mode mounts work
The Setup Checklist (and other admin surfaces) emitted bare /tenant/<id>/...
hrefs. Under the Storefront's /storefront/salesagent mount, those resolved
against the storefront host instead of the proxied salesagent path and
returned 404 from the parent app.
Service layer (setup_checklist_service.py, dashboard_service.py,
business_activity_service.py) now builds URLs via flask.url_for() so the
emitted hrefs include SCRIPT_NAME automatically. Templates that previously
hand-prepended {{ request.script_root }} are migrated to url_for() in the
same pass for consistency with CLAUDE.md Pattern #6.
SetupChecklistService runs from two transports: Flask (admin UI) and
Starlette via adcp.server.serve() (MCP/A2A). validate_setup_complete()
fires inside _create_media_buy_impl on the non-Flask path, where url_for()
would raise RuntimeError. _build_url() catches that and returns None;
validate_setup_complete only reads task['name'], so the gate behavior is
unchanged. Added tests/unit/test_setup_checklist_no_flask_context.py to
pin this contract.
JS string interpolations (`${tenantId}/...`) are intentionally untouched
— url_for can't help with runtime IDs, and CLAUDE.md Pattern #6 already
endorses `scriptRoot + path` for that case.
One pre-existing FIXME left: tenant_settings.html /settings/raw form
posts to a route that has no handler; touched only with a clarifying
comment, not a behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#357): also tolerate BuildError when url_for runs in a foreign Flask app
The tenant_management_api blueprint runs as its own Flask app
(tests/integration/test_managed_tenant_api.py:54 — a bare Flask()
with only that blueprint registered). When SetupChecklistService is
invoked from that app (via tenant_status_service → /status), url_for()
for admin-UI endpoints raises werkzeug.routing.BuildError, not
RuntimeError, because the endpoint isn't registered there.
_build_url now catches both. The management API never reads
action_url (it surfaces configure_path from a static map in
tenant_status_service._CONFIGURE_PATHS), so None is correct.
Adds TestServiceWorksInForeignFlaskApp to pin the contract — Flask
context exists, but the endpoint can't be built.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(deps): bump adcp 5.3.0 → 5.4.0; drop three local workarounds (#394)
* chore(deps): bump adcp 5.3.0 → 5.4.0; drop three local workarounds
5.4.0 ships every upstream issue I filed yesterday plus a bonus.
## Drops with the bump
| Workaround | Upstream fix |
|---|---|
| ``core/middleware/www_authenticate.py`` (74 LOC) — injected ``WWW-Authenticate: Bearer`` on 401 because the MCP-leg ``BearerTokenAuthMiddleware._unauthenticated`` returned ``JSONResponse(status_code=401)`` without the header | adcp-client-python#712 → #715: ``_unauthenticated`` now emits the header on both the MCP dispatch path AND the ASGI ``_send_unauthenticated`` path, matching what the A2A leg already did |
| ``core/idempotency._ReplayMarkingStore`` (~120 LOC + private-symbol coupling to ``_WRAPPED_FUNCTIONS`` / ``_clone_response`` / ``_resolve_call_args`` / ``_to_dict`` / ``CachedResponse``) — reimplemented the full ``IdempotencyStore.wrap`` body inline to inject ``replayed: true`` on cache hits per AdCP L1/security rule 4 | adcp-client-python#714 → #717: ``IdempotencyStore.wrap`` now does ``response["replayed"] = True`` on the cache-hit branch natively |
| ``mcp_header_name="x-adcp-auth"`` + ``mcp_bearer_prefix_required=False`` — the MCP leg accepted ONLY the custom legacy header, EXCLUDING the spec-canonical ``Authorization: Bearer``. Caused ``security_baseline/probe_api_key`` storyboard failures | adcp-client-python#720 → #721: ``Authorization: Bearer`` is always accepted; ``mcp_legacy_header_aliases=[...]`` is a purely additive opt-in for adopters with deployed legacy clients |
Net diff: -533 LOC including the obsolete test files.
## Auth shape after bump
Old (broken for spec-compliant clients):
```python
BearerTokenAuth(
validate_token=_validate_token,
mcp_header_name="x-adcp-auth",
mcp_bearer_prefix_required=False,
)
```
New (spec compliance + zero break for legacy clients):
```python
BearerTokenAuth(
validate_token=_validate_token,
mcp_legacy_header_aliases=["x-adcp-auth"],
)
```
``Authorization: Bearer <token>`` is now accepted on both legs by default
(the spec carrier per RFC 6750). The ``x-adcp-auth`` legacy header keeps
working unchanged for any early-adopter MCP client still on it. Migration
is a one-way drift with no flag day.
## Files removed
- ``core/middleware/www_authenticate.py``
- ``core/tests/test_idempotency_replay_marking.py``
- ``tests/unit/test_www_authenticate_middleware.py``
- The ``WWWAuthenticateMiddleware``-ordering test in
``tests/unit/test_serve_kwargs_middleware_order.py``
## Verification
- ``make quality``: 4294 passed, 14 skipped, 19 xfailed
- Existing ``_ReplayMarkingStore`` callers in ``get_idempotency_store()``
swapped to plain ``IdempotencyStore`` — same constructor signature,
upstream provides the injection
- ``test_serve_kwargs_middleware_order.py`` updated to drop the
``WWWAuthenticateMiddleware``-position pin (middleware no longer
exists)
- After deploy, the compliance probe's ``security_baseline/probe_api_key``
and ``assert_mechanism`` storyboard steps should flip to pass — closes
the auth-header gap we filed as bokelley/salesagent#386 (which can now
be closed as "fixed upstream")
## Closes (when deployed)
- bokelley/salesagent#386 — multi-header auth (now native via #720)
- The remaining compliance-probe residual on ``security_baseline``
(3 → 1 failure; only ``proposal_finalize/create_media_buy`` left,
tracked separately as #387)
## Doesn't pick up
- 5.4.0's ``LazyPlatformRouter.proposal_stores=`` / ``proposal_store_factory=``
(#722/#724) — this is the wiring point for bokelley/salesagent#387.
Separate PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review(deps): switch test fixtures to new BearerTokenAuth shape (PR #394 nit)
Code-reviewer flagged that two test files still constructed
``BearerTokenAuth`` with the legacy ``mcp_header_name`` /
``mcp_bearer_prefix_required`` kwargs even though production swapped
to ``mcp_legacy_header_aliases=[...]`` in this PR's main commit. The
tests passed against 5.4.0 (back-compat shim works) but emitted
``DeprecationWarning`` and stopped mirroring the production config —
production now ACCEPTS ``Authorization: Bearer`` on the MCP leg
alongside ``x-adcp-auth``, but these test fixtures still wired the
old exclusive-header semantics.
## Changes
* ``tests/unit/test_per_leg_bearer_auth.py``: ``_production_auth`` and
``_build_mcp_app`` updated to the new shape. The inner
``BearerTokenAuthMiddleware`` construction now passes
``legacy_header_aliases=auth.resolved_mcp_legacy_aliases()`` and
``legacy_aliases_bearer_prefix_required=auth.legacy_aliases_bearer_prefix_required``
in place of the deprecated ``header_name`` / ``bearer_prefix_required``
pair. Mirrors what adcp.server.serve._wrap_mcp_with_auth does
natively against 5.4.0.
* ``tests/unit/test_agent_card_auth_scheme.py``: ``_production_auth``
same swap; module-level docstring updated to reflect that both legs
now default to ``Authorization`` and the MCP leg additively accepts
``x-adcp-auth`` for legacy adopters (not as an exclusive override).
## Verification
* 10/10 targeted tests pass
* ``make quality`` — 4294 passed, 14 skipped, 19 xfailed; warning
count dropped from 117 to 105 (the deprecation warnings are gone)
No behavior change beyond what PR #394's main commit ships. Tests now
exercise the same wire shape production uses.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(logs): strip debug instrumentation + collapse audit fan-out (#395)
Production log volume on Fly was 70%+ noise. Three categories of offender,
all in our code (not the adcp SDK):
src/core/context_manager.py
- Delete leftover ``console.print`` debug blocks: 🔍 PRE-COMMIT WEBHOOK
DEBUG (16 lines per workflow step update), 🔍 POST-COMMIT WEBHOOK DEBUG
(5 lines), and the 🚀 WEBHOOK / ⚠️ WEBHOOK SKIPPED chatter. These read
as leftover instrumentation from a past debugging session and fire on
every workflow step status change.
- Convert remaining ``console.print`` calls to ``logger.debug`` (lifecycle
events: context/step creation, object linking, webhook dispatch) or
``logger.warning`` / ``logger.exception`` (errors).
- Drop the unused ``rich.console.Console`` import and module-level
``console = Console()`` singleton.
src/core/helpers/adapter_helpers.py
- Demote ``[ADAPTER_SELECT]`` / ``[ADAPTER_CONFIG]`` from ``logger.info``
to ``logger.debug`` (10 call sites). These are tracing-grade fields,
not operational signals — they should be off in production unless
someone is actively debugging adapter selection.
src/core/audit_logger.py
- Collapse the per-detail audit fan-out: instead of one ``logger.info``
per ``details`` dict key (N+1 lines per audit event), emit a single
``"<message> | <details>"`` line. Full details still persist to
``AuditLog.details`` for structured queries — the per-line fan-out was
legible in local tail but flooded production stdout.
The ``adcp.audit`` logger name is shared with the SDK's ``LoggingAuditSink``
but the noisy emissions come from our own ``audit_logger`` in
``src/core/audit_logger.py``, not the SDK — so nothing to file upstream.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(logs): drop /mcp+/health access spam, rate-limit geo warning, fix trafficker_id log bug (#397)
Three followups from the audit pass on production fly logs.
src/core/logging_config.py
- Add ``UvicornAccessNoiseFilter`` and attach it to ``uvicorn.access`` in
both production (JSON) and development (standard) modes. The filter
drops 2xx GET/POST/HEAD/OPTIONS access lines on /mcp[/] and /health —
the two endpoints hit constantly by storefront MCP pollers and Fly's
TCP+HTTP health checks. 4xx/5xx still surface so auth failures and
server errors aren't buried. Other paths (admin UI, /a2a, /.well-known,
/mcp-debug, etc.) are unaffected. Behavioral contract pinned by 18
parametrized tests in tests/unit/test_uvicorn_access_filter.py.
src/adapters/gam/managers/targeting.py
- Rate-limit the "Could not load geo mappings file" + "Using empty geo
mappings" warnings to once per process lifetime via a module-level flag.
Each ``GAMTargetingManager`` instance fires on every adapter selection,
so the same warning flooded the log on every GAM-tenant request. The
underlying file-not-found is still tracked in #396 (it means GAM geo
targeting silently produces empty results in prod and needs a
packaging-side fix).
src/adapters/google_ad_manager.py
- Fix the "Could not auto-detect trafficker_id: User instance has no
attribute 'get'" warning. The googleads SOAP client returns a zeep
complex object — it supports __getitem__ and attribute access but NOT
``.get()``. The old code called ``current_user.get('name', 'Unknown')``
inside the success-log f-string, which raised AttributeError AFTER
``self.trafficker_id`` was already assigned. The ID was being detected
correctly all along; only the success log was failing and producing a
misleading warning on every request. Switched to ``getattr`` for the
optional ``name`` field.
Filed #396 to track the underlying production OOM kill on the iad
machine and the missing ``gam_geo_mappings.json`` packaging issue — both
infrastructure-level and out of scope here.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(proposal): wire Postgres-backed ProposalStore for create_media_buy(proposal_id=…) (#390)
* feat(proposal): wire Postgres-backed ProposalStore for create_media_buy(proposal_id=…)
Closes the proposal-lookup gap that made
``proposal_finalize/create_media_buy`` fail with
``INVALID_REQUEST: Invalid budget: 0.0``. Without a wired
:class:`ProposalStore`, the framework's ``proposal_dispatch`` had no
backing for the buyer's ``proposal_id`` and ``create_media_buy``
landed in package-derivation with zero packages.
Pieces:
- ``proposals`` table (migration ``r0s1t2u3v4w5``) — mirrors the
v1.5 ``ProposalRecord`` dataclass with multi-tenant scoping and a
partial unique on ``(account_id, media_buy_id) WHERE media_buy_id
IS NOT NULL`` for reverse-index lookups
- :class:`SalesAgentProposalStore` — implements every
:class:`adcp.decisioning.proposal_store.ProposalStore` Protocol
method (put_draft / get / commit / try_reserve_consumption /
finalize_consumption / release_consumption / mark_consumed /
discard / get_by_media_buy_id) against the new table. Atomic CAS
via ``SELECT … FOR UPDATE`` serializes parallel callers.
Cross-tenant probes collapse to ``None`` / ``PROPOSAL_NOT_FOUND``
per the Protocol's principal-enumeration defense.
- :class:`_LazyPlatformRouterWithStore` — thin subclass that adds
the ``proposal_store_for_tenant`` accessor the framework's
``proposal_dispatch`` duck-types. Upstream
:class:`LazyPlatformRouter` doesn't expose it (only the eager
:class:`PlatformRouter` does, via ``proposal_stores=``).
- Wired into ``build_router()`` — single shared store across
tenants; isolation runs inside the store on
``expected_account_id``.
v1 lifecycle compromise (documented in the store docstring): the
storyboard flow goes brief → create_media_buy WITHOUT an
intermediate finalize step, but the framework's
:meth:`try_reserve_consumption` requires the proposal to be in
``committed`` state. The store auto-commits at ``put_draft`` time
with a 7-day ``expires_at`` so the buyer flow unblocks today. The
Protocol surface is unchanged — only the internal lifecycle state
differs. When the manager declares ``finalize=True`` in v2, swap
to canonical ``draft`` + explicit commit.
Tests:
- ``tests/integration/test_proposal_store.py`` — 15 integration
tests against real Postgres covering put_draft auto-commit,
payload round-trip, refine overwrite, cross-tenant probe defense
(get + try_reserve), two-phase consumption lifecycle, atomic CAS
double-reservation rejection, reverse-index lookup with
``expected_account_id`` enforcement, idempotent release/discard
- ``tests/unit/test_lazy_router_with_proposal_store.py`` — 3 unit
tests pinning the router subclass's accessor wiring
- ``tests/unit/test_proposal_store_attributes.py`` — 2 unit tests
pinning ``is_durable=True`` (production-mode gate) and the
7-day default hold window
Refs #387
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(proposal): adopt adcp 5.4 — drop workarounds, use upstream surface
Upstream shipped both items we filed during #387:
- adcp-client-python#722 → 5.4: LazyPlatformRouter accepts
``proposal_stores=`` and ``proposal_store_factory=``. Deletes our
``_LazyPlatformRouterWithStore`` subclass.
- adcp-client-python#723 → 5.4: ``ProposalCapabilities.auto_commit_on_put_draft``
shipped option B from the issue. The framework now calls
``store.commit`` immediately after ``put_draft`` for opted-in
managers. Deletes our store-side ``state=COMMITTED`` workaround
in ``put_draft``.
Migration:
- Bump ``adcp>=5.4.0``.
- ``SalesAgentProposalManager.capabilities`` declares
``auto_commit_on_put_draft=True``; framework owns the
DRAFT → COMMITTED promotion via
``auto_commit_ttl_seconds=604800`` (7-day default, matches our
prior store-side hold window).
- ``core/main.build_router`` calls ``LazyPlatformRouter(...)`` directly
with ``proposal_store_factory=lambda _tid: shared_store``. Factory
shape over eager dict because the store is a single shared
instance — eager dict would force boot-time tenant enumeration
and miss tenants registered after boot.
- ``SalesAgentProposalStore.put_draft`` writes spec-canonical
``draft`` state with ``expires_at=None``. The ``_committed_hold``
constructor param and the 7-day default are gone — the framework's
``auto_commit_ttl_seconds`` capability owns the TTL.
Tests:
- Integration: 16 tests rewritten — put_draft asserts DRAFT (not
COMMITTED), reservation lifecycle tests use a ``_put_and_commit``
helper that mirrors the framework's auto-commit dispatch, new
``TestCommit`` class covers commit promotion + idempotency +
payload-drift rejection, new test pins that put_draft on a
COMMITTED record raises ``INTERNAL_ERROR`` per Protocol.
- Unit: deleted ``test_lazy_router_with_proposal_store.py`` (no
subclass to test); trimmed ``test_proposal_store_attributes.py``
to the durability flag only (the 7-day default belongs to the
framework now).
Refs #387
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(proposal): address review — split compound account_id, account-scoped locks, fail-closed unscoped methods
Review feedback on PR #390:
**B1 (blocker): _resolve_tenant_id_for_account returned account_id
verbatim.** SalesagentAccountStore.resolve mints ``f"{tenant_id}:{ref}"``
(``ref`` defaults to ``"default"``; storyboard runs use ``"acct_demo"``).
The framework passes ``ctx.account.id`` straight into ``put_draft``,
so every prod ``put_draft`` would FK-violate on ``proposals.tenant_id``.
Fixed: split on ``":"`` and take the prefix. New integration test
``test_put_draft_handles_compound_account_id`` regresses this — uses
the real shape the framework emits.
**Security MAJOR (×3): try_reserve / finalize / release did
SELECT FOR UPDATE then filtered account_id in Python.** Cross-tenant
probes acquired the row lock, leaking existence via timing AND
providing a DoS primitive against legitimate same-tenant operations.
Fixed: ``account_id`` moved into the WHERE clause so cross-tenant
probes never acquire the lock. Two new integration tests pin the
behavior:
- test_finalize_cross_tenant_collapses_to_internal_error
- test_release_cross_tenant_is_noop (verifies foreign tenant's
release doesn't roll back the owner's CONSUMING reservation)
**Security MAJOR (×2): discard() and mark_consumed() Protocol
signatures lack ``expected_account_id``.** Any caller obtaining a
``proposal_id`` could destroy / terminate another tenant's proposal.
Neither is called by adcp 5.4's ``proposal_dispatch`` today; fixed:
both raise ``NotImplementedError`` with an ERROR log. Future framework
versions that begin calling them surface loudly before reaching prod.
Two new tests pin the fail-closed behavior.
**MAJOR M3: _serialize_recipes silently passed dicts through.**
Violates "No quiet failures" (CLAUDE.md). Fixed: raises TypeError on
non-Pydantic input — caller has to pass typed Recipe instances.
**MINOR m3: lazy imports inside every method.** Hoisted
``ProposalRecord``, ``ProposalState``, ``AdcpError`` to module level —
no circular import; the salesagent already imports the library
at module-load time elsewhere.
**NIT n2/n3: stale temporal references.** Dropped "v1 auto-commit
workaround landed before #723 and is gone" from the store docstring
and "v1 auto-commits at put_draft time" from the Proposal model
docstring. Per CLAUDE.md: don't document the prior behavior.
**M2 partial coverage: end-to-end account_id shape test added.**
``test_put_draft_handles_compound_account_id`` exercises the realistic
``"tenant_id:default"`` shape the framework actually emits. Full
end-to-end (HTTP → proposal_dispatch → store) deferred to compliance
probe post-deploy — the unit layer pins every store-side invariant.
24 integration + unit tests pass; ``make quality`` clean (4311 tests).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* review(proposal): expires_at guard + 8 lock-in tests cherry-picked from PR #398
Two additions from @bokelley's parallel #398 work that #390 lacked:
**1. Defense-in-depth expires_at check inside try_reserve_consumption.**
Security reviewer L1 finding on #398: a buyer holding a COMMITTED
proposal past its ``expires_at`` could reserve and finalize
indefinitely. The framework's
``proposal_dispatch._hydrate_proposal_context`` checks expiry on the
get-side, but ``try_reserve_consumption`` is reachable from dispatch
paths that bypass that filter (and from adopter callers that go
straight to the store). New three-line guard inside the existing row
lock raises ``PROPOSAL_EXPIRED`` with ``recovery="correctable"``.
Mirrors upstream :class:`InMemoryProposalStore._evict_expired_locked`
but surfaces the event rather than silently deleting so audit trails
survive.
**2. mark_consumed restored as implemented Protocol method.** Earlier
fail-closed pattern was over-cautious for a Protocol method the
framework doesn't currently call. Now matches the upstream
:class:`InMemoryProposalStore.mark_consumed` shape verbatim, with a
WARNING audit log on every call so unexpected invocations are
visible. Documented Protocol-signature gap (no
``expected_account_id``) — same upstream constraint that
:meth:`discard` has; ``discard`` stays fail-closed because the user's
follow-up list didn't include it.
**Tests (9 added, 1 replaced):**
- test_reserve_past_expires_at_raises_expired (locks in #1)
- test_release_silent_no_op_on_missing
- test_release_silent_no_op_on_cross_account
- test_finalize_idempotent_on_consumed_matching_media_buy
- test_finalize_mismatched_media_buy_raises
- test_mark_consumed_promotes_to_consumed
- test_mark_consumed_idempotent_on_matching
- test_mark_consumed_mismatched_raises
- test_mark_consumed_unknown_raises_internal_error
- Replaced ``test_mark_consumed_raises_not_implemented`` with the
four ``TestMarkConsumed`` cases above
All cherry-picked from #398's test suite (locked-in shapes already
correct in #390's code per @bokelley's close comment). 32 integration
+ unit tests pass; ``make quality`` clean (4311 tests).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(types): adopt SchemaVariant for 12 cross-class schema overrides (#400)
adcp 5.4.0 #718 ships ``SchemaVariant[T]`` + a mypy plugin that
rewrites the annotation to ``Any`` for override-compat purposes,
retiring the ``# type: ignore[assignment]`` stamps adopters used
to carry on cross-class entity overrides.
The 12 sites in src/core/schemas/ all match the cross-class pattern
the marker targets:
- 4× geo_*_exclude — parent declares Geo{Country,Region,Metro,
PostalArea}ExcludeItem; we substitute the inclusion variant
- 2× creatives — parent declares CreativeAsset; we substitute
our extended Creative
- 1× deployments — parent declares Deployments; we substitute
SignalDeployment
- 1× media_buys — parent declares MediaBuy; we substitute
the GetMediaBuysMediaBuy delivery-context view
- 1× ext — parent declares ExtensionObject; we use dict
- 1× sync_creatives.creatives — parent's CreativeAsset; we
use our local CreativeAsset subclass
- 1× query_summary — parent's QuerySummary; we use our local
- 1× media_buy_deliveries / 1× creatives in delivery.py —
delivery-context views
mypy.ini gets ``adcp.types.mypy_plugin`` added to the plugins
line alongside the existing sqlalchemy + pydantic plugins.
Tradeoff (documented upstream): inside the override, mypy sees the
field as ``Any``. ``typing.cast(list[T], self.field)`` recovers
precise inference at call sites that need it. None of the touched
sites currently rely on inside-override inference at usage sites,
so no cast() is needed for this change.
make quality: 4319 passed, 14 skipped, 19 xfailed.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: drop SchedulerLifespanMiddleware, use serve(on_startup=, on_shutdown=) (#401)
adcp 5.4.0 #713 ships native lifespan hooks on ``serve(transport='both')``,
which is exactly what the middleware was hand-rolling. The middleware
intercepted ASGI ``lifespan.startup`` / ``lifespan.shutdown`` scope events
to fire scheduler start/stop coroutines because earlier SDK versions
didn't expose a user-supplied lifespan extension point.
Now they do. The SDK's ``on_startup`` / ``on_shutdown`` kwargs take the
same ``Callable[[], Awaitable[None]]`` shape that ``_start_schedulers``
and ``_stop_schedulers`` already had, so the swap is mechanical:
- Drop ``SchedulerLifespanMiddleware`` from the ``asgi_middleware`` list.
- Pass ``on_startup=[_start_schedulers]`` / ``on_shutdown=[_stop_schedulers]``
in ``_serve_kwargs()``, conditional on ``include_scheduler`` (tests still
skip).
- Delete ``core/middleware/scheduler_lifespan.py`` (61 LOC).
- Update the ``_serve_kwargs`` docstring to reference the SDK hook instead.
The middleware ran scheduler shutdown with a 10s ``asyncio.wait_for``
guard; the SDK fires hooks unguarded. Our ``_stop_schedulers`` already
caps its own awaitables (delivery + media-buy status schedulers each
join their internal task groups with a bounded timeout), so dropping
the outer wait_for is fine — it was defensive double-bookkeeping.
make quality: 4319 passed, 14 skipped, 19 xfailed.
Closes the second of three local rip-outs unlocked by the adcp 5.4.0
bump. The third (AgentCardPublicUrlMiddleware → public_url callable)
lands separately.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: swap AgentCardPublicUrlMiddleware for public_url callable (#402)
The salesagent middleware existed because earlier SDK versions either
hardcoded ``http://localhost:{port}/`` into the agent card with no
override hook (pre-5.0) or crashed ``transport='both'`` startup when
``public_url`` was a callable (5.2.0, ``AttributeError: 'function'
object has no attribute 'router'``).
adcp 5.3.0 #680 fixed the composed-lifespan crash. 5.4.0 has confirmed
the callable path works under ``transport='both'`` in production. The
SDK's ``serve(public_url=PublicUrlResolver)`` is now the right primitive
for per-request agent-card URL derivation.
## What lands
- ``core/main._resolve_public_url(request) -> str`` — pure function
with the same header-precedence rules the middleware enforced:
PUBLIC_URL env > X-Forwarded-Host > Host, X-Forwarded-Proto for
scheme, ``http://`` for loopback / ``https://`` otherwise.
- Wired as ``"public_url": _resolve_public_url`` in ``_serve_kwargs``.
- Drop the middleware from the ``asgi_middleware`` list.
- Delete ``core/middleware/agent_card_public_url.py`` (189 LOC).
- Replace ``test_agent_card_public_url_middleware.py`` with 13 tests
of the new resolver covering: X-Forwarded-Host precedence, Host
fallback, comma-chain stripping, proto override, https default,
loopback http exception (matches SDK's ``_validate_card_url``),
PUBLIC_URL env override, no-headers fallback.
- Update ``test_serve_kwargs_middleware_order`` — replace the
middleware-present assertion with a ``public_url is callable``
assertion.
## Net diff
-442 LOC…
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.